MuTasim

Your Memory is Leaking

⌛ 11 min read

In this blog, we will explore memory leaks. I will primarily focus on JS, but other high-level programming languages are similar in this regard. They have a garbage collector that manages memory for us and tries to prevent leaks as much as possible. You might think, "Hey, why talk about a problem that doesn't exist, unlike in low-level programming languages like C?" Well, hold on and let's learn something new today!

Symptoms

Yup, right, memory leaks have their own symptoms. And I'm sure one of them has happened to you (as long as you use a browser). Have you ever experienced a situation where the browser loads for a long time and then complains that the page is unresponsive, leaving you with only two choices: wait or quit? No? What about speed? Have you ever noticed the browser becoming so slow that you can't switch between pages normally? Or perhaps your operating system slows down too? Of course, because your browser gradually consumes more and more RAM from your computer. Yes, these are some symptoms that you can definitely observe due to memory leaks.

But how does this even happen in a language that has a garbage collector that takes care of the garbage in our code for us?

Garbage collector (GC):

What? Garbage collector? Yes, it's not a new concept; it's been around for a while in the world of computer science.

Back in the 1950s and 1960s, when computers were just starting to become popular, programmers had to manage memory manually. They had to keep track of every bit of memory they used and make sure to clean up after themselves. This was a tedious and error-prone process, leading to memory leaks and bugs in programs.

In 1959, the concept of automatic garbage collection became a thing. This was a way for computers to clean up unused memory automatically, without programmers having to do it manually. So yeah, it's good DX (developer experience). It literally frees programmers from manually deallocating memory. Oh, it's so unbelievably helpful that many developers forget it even exists. (LOL)

What's a garbage collector?

Let's simplify it as much as possible (I like using analogies). Imagine your computer's memory as a big box where it stores information. When you run a program or open a file, it needs some space in this box to work with. But sometimes, after you're done using a piece of information, your program might forget to clean up after itself. It's like leaving stuff lying around your room instead of putting it back where it belongs. This unused stuff takes up space in the memory box, making it harder for your computer to find room for new things.

That's where garbage collection comes in. It's like having a cleaning crew for your computer's memory. The garbage collector's job is to look through the memory box, find all the stuff that's no longer being used, and throw it away. This frees up space so your computer can use it for new things.

How does GC work?

I don't want to leave things on the surface; we've talked about "what" is GC, but what about "how" it actually works? This part is always interesting to me! So let's see how it works:

Mark and Sweep:

Mark and Sweep is one of the earliest and best-known garbage collection algorithms, and yes, JS uses this algorithm too (depending on the engine, ofc). So, how does it work? Here's how:

  • Mark: The garbage collector begins by marking all the memory still in use by the program. It starts from a set of known "roots" (like global variables or objects referenced by the program) and follows all the references from these roots to other objects. Any objects it reaches during this process are marked as "in use ✔️".
  • Sweep: Once all the reachable objects have been marked, the garbage collector sweeps through the memory and deallocates any unmarked objects (dead objects). These unmarked objects are considered garbage because they are no longer reachable from the roots and are not being used by the program.
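
The two phases above can be sketched in a few lines of JS. This is only a toy simulation under my own assumptions: the "heap" is a Map of made-up object ids, and markAndSweep is a hypothetical helper, not how a real engine stores or collects objects.

```javascript
// Toy heap: each entry has an id and a list of ids it references.
// This is a simulation for illustration, not a real engine's data layout.
function markAndSweep(heap, rootIds) {
  const marked = new Set();
  const stack = [...rootIds];

  // Mark: follow references from the roots and remember everything we reach.
  while (stack.length > 0) {
    const id = stack.pop();
    if (marked.has(id) || !heap.has(id)) continue;
    marked.add(id);
    stack.push(...heap.get(id).refs);
  }

  // Sweep: anything that was never marked is unreachable, so remove it.
  const swept = [];
  for (const id of heap.keys()) {
    if (!marked.has(id)) {
      heap.delete(id);
      swept.push(id);
    }
  }
  return swept;
}

const heap = new Map([
  ['window', { refs: ['app'] }],
  ['app', { refs: ['user'] }],
  ['user', { refs: [] }],
  ['orphanA', { refs: ['orphanB'] }], // orphanA and orphanB point at each other,
  ['orphanB', { refs: ['orphanA'] }], // but nothing reachable points at them
]);

const swept = markAndSweep(heap, ['window']);
console.log(swept);            // ['orphanA', 'orphanB']
console.log([...heap.keys()]); // ['window', 'app', 'user']
```

Notice that the two orphans reference each other and still get swept. What matters is reachability from the roots, not whether something somewhere points at the object.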

Here's a picture to make it easier to understand:

[Figure: Mark and Sweep algorithm]

So by now, we have a clear idea about garbage collectors and their effectiveness. However, as is often the case, there are downsides to every approach, and why not talk about them?

Garbage collection adds performance overhead, and because it runs at non-deterministic times it can occasionally pause program execution, which is a problem for latency-sensitive applications. Memory fragmentation and resource consumption may also grow over time, especially in systems with large heaps or high allocation rates. And a GC only manages memory: external resources, such as file handles, still have to be released by hand and will leak if they are not.

Despite these drawbacks, GC continues to be widely used due to its ability to simplify memory management.

Manual memory management

Before we dive into the main topic, let's take a look at C. Unlike high-level languages, C doesn't have garbage collection. So, how does C handle memory management? As mentioned earlier, programmers have to handle memory allocation and deallocation themselves in their code. It's like giving instructions to the computer about when to grab some memory for data and when to let it go when it's not needed anymore.

Let's explain it in simple terms to understand it better. Imagine your computer's memory is like a big box of drawers. When you need to store some information, you ask the computer to give you a drawer (allocate memory). You use that drawer for as long as you need, then you tell the computer you're done with it and it can use it for something else (deallocate memory).

Now, let's take a look at an example in C code:

./example.c
#include <stdio.h>
#include <stdlib.h>

int main() {
    // Asking for a drawer (memory) to store an integer
    int *number = (int *)malloc(sizeof(int));

    // malloc() returns NULL if no memory could be allocated
    if (number == NULL) {
        return 1;
    }

    // Putting a number inside the drawer
    *number = 10;

    // Printing the number
    printf("Number: %d\n", *number);

    // Now we are done with the drawer, so we give it back to the computer
    free(number);

    return 0;
}

In C, malloc() and free() are two important functions used for manual memory management.

  • malloc(): This function stands for "memory allocation". It's used to request a block of memory from the heap. When you call malloc(), you specify the amount of memory you need in bytes. It returns a pointer to the allocated memory if the allocation succeeds, or NULL if it fails.
  • free(): Once you're done using a block of memory allocated with malloc() (or related functions), you must return it to the system so it can be reused. This is where free() comes in. It takes a pointer to the memory block that you want to deallocate. After calling free(), that memory is no longer reserved for your program, and it can be used for other purposes.

Looks like a pain, doesn't it? Cleaning up your code after yourself manually. (ugh)

Why is your memory leaking?

So we've talked about GC and we've talked about manually managing memory. But you might be thinking, "I don't code in C, nor do I manually manage memory, so why should I care about it when the garbage collector does it for me?" Well, here's the thing - you still need to care about it!

The day will come when you'll find your memory leaking, and you'll be wondering why. It might seem impossible, and all you'll end up doing is restarting your computer, which isn't a real solution here.

Global Variables

Earlier, we discussed the mark and sweep algorithm, which marks the objects reachable from the GC roots. But what is the root for JS when we open the browser? window is the main JavaScript root object, also known as the global object in a browser. So, global variables such as window.x = 10 can cause memory leaks. A single number is harmless, but the same thing happens to any large object you hang on window: it is always reachable from the root, so the garbage collector considers it in use for as long as the page lives.

How can we prevent this? The accidental case is easy: write "use strict", which throws a ReferenceError whenever you assign to a variable you never declared, instead of silently creating a global. Globals you create on purpose still have to be set to null (or deleted) once you are done with them.
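
Here's a small sketch of what strict mode does and doesn't catch. The names assignUndeclared, accidentalGlobal and explicitCache are made up for this example, and I use globalThis (which is window in a browser) so the snippet also runs outside the browser.

```javascript
function assignUndeclared() {
  'use strict';
  try {
    accidentalGlobal = { big: 'data' }; // no let/const/var in front
    return 'no error';
  } catch (err) {
    return err.name;
  }
}

console.log(assignUndeclared()); // 'ReferenceError'

// Strict mode does not stop explicit globals; these must be released by hand.
globalThis.explicitCache = { big: 'data' };
delete globalThis.explicitCache; // or set it to null once it is no longer needed
console.log('explicitCache' in globalThis); // false
```

So strict mode protects you from the typo kind of global, not from the ones you create deliberately.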

Timers

Another problem that prevents objects from being garbage collected is having a setTimeout or a setInterval in your code. If we set such a timer in our code, the reference to the object from the timer's callback will stay active for as long as the callback is invocable. Let's look at an example:

./example.js
function setCallback() {
  const info = {
    count: 0,
    longText: new Array(100000).join('x')
  };

  return function printCount() {
    info.count++;
    console.log(info.count);
  };
}

setInterval(setCallback(), 1000);

In this example, the info object can be garbage collected only after the timer is cleared. Since we never stored the ID returned by setInterval, the timer can never be cleared, and info.longText is kept in memory until the app stops, although it is never used.

How can we prevent this? First, we should be aware of the objects referenced from the timer's callback. We should assign our timer to a variable and clear it when needed. For example:

./example.js
const timer = setInterval(setCallback(), 1000);
clearInterval(timer);
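
In real code you usually clear the timer later, not right away. One way to keep that tidy (just a sketch, startCounter and stop are names I made up) is to have the function that starts the timer hand back a function that stops it:

```javascript
// Hypothetical helper: starts the counter and returns a way to stop it.
function startCounter(intervalMs) {
  const info = { count: 0, longText: 'x'.repeat(100000) };

  const timerId = setInterval(() => {
    info.count++;
  }, intervalMs);

  // After stop() has run, the timer no longer references the callback,
  // so `info` becomes collectable once `stop` itself is dropped too.
  return function stop() {
    clearInterval(timerId);
    return info.count;
  };
}

const stop = startCounter(1000);
// ... later, e.g. when the page section using the counter goes away:
const ticks = stop();
console.log(ticks); // 0 here, because we stopped before the first tick
```

This way the interval ID never gets lost, because whoever starts the timer also receives the means to clear it.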

DOM References

In our tasks, we may often want to store DOM nodes inside a data structure. It's a normal practice to do so. For instance:

./example.js
const element = {
    button: document.querySelector('.btn')
}

When this happens, two references to the same DOM element are kept: one in the DOM tree and the other in the element object. If, at some point in the future, you decide to remove this button, it must be deleted from both places.

./example.js
function removeBtn() {
    document.querySelector('.btn').remove();
}

Here, we remove it from the DOM tree, but we still have a reference to ".btn" in the global element object. In other words, the button element is still in memory and cannot be collected by the GC. To prevent memory leaks in scenarios like this, it's important to ensure that you remove references to DOM elements when they are no longer needed. For example, you can set the reference in the element object to null after removing the element from the DOM:

./example.js
function removeBtn() {
    const button = document.querySelector('.btn');
    button.remove();
    element.button = null;
}

Cache

When caching data using a Map in JavaScript, a potential problem arises if the keys are objects that are no longer needed elsewhere in the application. The Map holds a strong reference to every key, which prevents those objects from being garbage collected and leads to memory leaks. Here's an example:

./example.js
let user_1 = { name: "Peter", id: 12345 };
let user_2 = { name: "Mark", id: 54321 };

const mapCache = new Map();

// Function to cache user data
function cache(obj) {
    if (!mapCache.has(obj)) {
        const value = `${obj.name} has an id of ${obj.id}`;
        mapCache.set(obj, value);
        return [value, 'computed'];
    }

    return [mapCache.get(obj), 'cached'];
}

console.log(cache(user_1)); // ['Peter has an id of 12345', 'computed']
console.log(cache(user_1)); // ['Peter has an id of 12345', 'cached']
console.log(cache(user_2)); // ['Mark has an id of 54321', 'computed']

user_1 = null; // Remove inactive user

console.log(mapCache); // Map(2) { {name: "Peter", id: 12345} => "Peter has an id of 12345", {name: "Mark", id: 54321} => "Mark has an id of 54321" }

In this example, we've used a Map called mapCache to store user data. After caching user_1 and user_2, we drop our own reference to the first user by setting user_1 to null. The Map still holds that object as a key, so the entry stays in the cache and the object can never be garbage collected.
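
If you have to stay with a plain Map, the entry must be deleted explicitly before you drop the last reference. A minimal sketch of the difference (strongCache, userA and userB are names made up for this snippet):

```javascript
const strongCache = new Map();

let userA = { name: 'Peter', id: 12345 };
let userB = { name: 'Mark', id: 54321 };
strongCache.set(userA, 'Peter has an id of 12345');
strongCache.set(userB, 'Mark has an id of 54321');

// Only dropping the variable: the Map still holds the object as a key.
userA = null;
console.log(strongCache.size); // 2

// Deleting the entry first, then dropping the variable: nothing is left behind.
strongCache.delete(userB);
userB = null;
console.log(strongCache.size); // 1 (only the leaked Peter entry remains)
```

It works, but it relies on you remembering to call delete in every place where a user goes away.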

So how can we solve this issue? Here's where WeakMap comes in. But what is WeakMap? Maybe you've heard of it before. WeakMap is a data structure with weakly held key references, which accepts only objects as keys. If we use an object as the key, and the WeakMap holds the only remaining reference to that object, the entry becomes eligible for garbage collection and will be removed from the cache.

In this example, we use WeakMap instead of Map, and when we set user_1 to null (effectively removing the inactive user), its entry can be garbage collected and disappears from the WeakMap as well:

./example.js
let user_1 = { name: "Peter", id: 12345 };
let user_2 = { name: "Mark", id: 54321 };

const weakMapCache = new WeakMap();

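function cache(obj) {
    if (!weakMapCache.has(obj)) {
        const value = `${obj.name} has an id of ${obj.id}`;
        weakMapCache.set(obj, value);
        return [value, 'computed'];
    }

    return [weakMapCache.get(obj), 'cached'];
}

console.log(cache(user_1)); // ['Peter has an id of 12345', 'computed']
console.log(cache(user_2)); // ['Mark has an id of 54321', 'computed']

user_1 = null; // Remove inactive user

console.log(weakMapCache); // once the GC has run, DevTools shows only Mark's entry: WeakMap { {name: "Mark", id: 54321} => "Mark has an id of 54321" }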
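
A few of the WeakMap properties mentioned above can be checked directly. This is a small sketch with made-up names (weakCache, key); it doesn't try to show the collection itself, since you can't control when the GC runs.

```javascript
const weakCache = new WeakMap();
const key = { name: 'Mark', id: 54321 };

weakCache.set(key, 'Mark has an id of 54321');
console.log(weakCache.has(key)); // true
console.log(weakCache.get(key)); // 'Mark has an id of 54321'

// Primitive keys such as strings are rejected.
let primitiveKeyError = null;
try {
  weakCache.set('mark', 'value');
} catch (err) {
  primitiveKeyError = err.name;
}
console.log(primitiveKeyError); // 'TypeError'

// There is no size and no iteration, because entries can vanish whenever the GC runs.
console.log(weakCache.size);           // undefined
console.log(typeof weakCache.forEach); // 'undefined'
```

That missing size and iteration is the trade-off: a WeakMap is great for a cache keyed by objects, but you can't list or count what's in it.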

Bonus

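You can find memory leaks in your web application using Chrome DevTools. Here's how:

  • Open DevTools and go to the Memory panel.
  • Take a heap snapshot.
  • Use the part of the app that you suspect is leaking, then take a second snapshot.
  • Switch the second snapshot to the Comparison view and look for objects whose count keeps growing.

By following these steps, you can effectively identify and diagnose memory leaks in your web application using Chrome DevTools.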

Conclusion

We've explored how memory leaks can occur in JS, despite its built-in system to clean up unused memory (GC). Understanding the causes of these leaks and how to prevent them is crucial for maintaining the efficiency of our programs and ensuring a smooth experience for users.

As developers, it's important to pay close attention to how memory is managed in our code. By taking proactive measures, such as identifying and resolving memory leaks, we can optimize the performance of our applications.

In the end, it's all about ensuring that users have a great experience (UX) when they use these apps. That's why understanding and managing memory properly is so important in web development.

Happy coding! 💛