Configuring a PC for Large Language Models (LLMs)

Published 24th Dec 2024 - 5 minute read

AI large language models are advancing at a blistering pace, with new and improved models released relentlessly. Hardware to best power these models is likewise being developed and refined by Nvidia, AMD, and Intel, among others. The "best" hardware for any given model will vary with your exact circumstances, but there are some general guidelines to get you started.

CPU (Processor)

With LLM systems in mind, the exact model of CPU is less important than the platform it's on. Server-grade platforms like Intel Xeon or AMD EPYC are fantastic, but their availability & lead-times for the Australian market are often restrictive, so we tend to fall back to recommending AMD's Threadripper PRO platform. A 32-core, 64-thread Threadripper PRO CPU is a fair starting point unless you know your needs call for more or less.

Since it's not recommended to run LLMs on a CPU, the CPU typically shouldn't play a large role, but if your other work relies more heavily on the CPU then it could still be an important consideration. If your workflow involves data collection, manipulation, or pre-processing, the CPU will be a critical component to select carefully. Lastly, the choice of platform will dictate other factors like maximum memory capacity, PCI-e lane count, I/O connectivity, and future upgrade paths.


RAM (Memory)

Nvidia recommends having at least twice as much system memory as there is total GPU VRAM. This accommodates fully "pinning" the GPU's memory contents into CPU address space, which facilitates efficient buffering.
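If you'd like a quick sanity check, the rule of thumb is easy to apply yourself. Below is a minimal Python sketch; the function name is ours, purely for illustration:

    # Rule-of-thumb sketch: system RAM should be at least 2x total GPU VRAM.
    def recommended_system_ram_gb(total_vram_gb: float, factor: float = 2.0) -> float:
        """Minimum system RAM suggested for a given amount of total GPU VRAM."""
        return total_vram_gb * factor

    # Example: a build with 4x 48GB GPUs has 192GB of total VRAM,
    # so the guideline suggests at least 384GB of system memory.
    print(recommended_system_ram_gb(4 * 48))  # 384.0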


GPU (Video/Graphics Card)

Applications utilising LLMs have been made possible entirely because of GPUs' amazing performance for this type of computational problem! This means the GPU does almost all the heavy lifting for LLMs, and will likely be where you want to spend a considerable amount of your system budget.

Which GPU to pick for LLMs?

Professional or compute GPUs are recommended due to their focus on large VRAM capacities, coupled with cooling solutions designed with server chassis and multi-GPU systems in mind, which consumer-grade GPUs usually aren't any more. Examples include Nvidia's RTX 6000 Ada 48GB, L40S 48GB, and H100 80GB (if global shortages ever allow).

How much VRAM (GPU memory) do LLMs need?

With LLMs, the total VRAM is often the limiting factor in what can be achieved. For instance, a ~70B-parameter model at its native 16-bit precision needs roughly 140GB for the weights alone, and serving it approaches 200GB of VRAM once inference overhead (such as the KV cache) is included. As another example, Llama3-70b can be served with good performance in multi-user environments (small/mid-sized organisations) with 4x Nvidia RTX 6000 Ada or L40S GPUs.
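For rough planning, a back-of-envelope estimate works well. Here's a short Python sketch; the ~1.3x overhead factor is our own assumption, and real usage varies by framework and context length:

    # Back-of-envelope VRAM estimate for serving a model's weights.
    # bytes_per_param: 4 for fp32, 2 for fp16/bf16, 1 for int8 quantisation.
    def estimate_vram_gb(params_billions: float,
                         bytes_per_param: int = 2,
                         overhead: float = 1.3) -> float:
        weights_gb = params_billions * bytes_per_param  # 1B params ~ 1GB per byte
        return weights_gb * overhead

    print(estimate_vram_gb(70))     # ~182GB for a 70B model at 16-bit
    print(estimate_vram_gb(70, 1))  # ~91GB with int8 quantisation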

Will multiple GPUs improve performance in LLMs?

Absolutely, yes! LLM frameworks make good use of multiple GPUs, typically by splitting a model's layers or tensors across cards so that their VRAM pools effectively combine into one.
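As one concrete illustration, frameworks such as vLLM expose this as a single setting. A minimal sketch, assuming vLLM is installed and the (illustrative) model weights are available:

    # Sketch: splitting one model across 4 GPUs via tensor parallelism in vLLM.
    from vllm import LLM, SamplingParams

    llm = LLM(
        model="meta-llama/Meta-Llama-3-70B-Instruct",  # example model name
        tensor_parallel_size=4,  # shard the model's tensors across 4 GPUs
    )
    outputs = llm.generate(["Why use multiple GPUs for LLMs?"],
                           SamplingParams(max_tokens=128))
    print(outputs[0].outputs[0].text)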

Nvidia or AMD GPU for LLMs?

Nvidia is the market leader for GPU computing and is largely responsible for the rapid development of AI so far. Its CUDA software ecosystem is the first (and sometimes only) target for most LLM frameworks, and Nvidia has continued to innovate and produce significant generational improvements in its products.

Do LLMs require a "professional-grade" GPU?

Strictly speaking, no, but professional GPUs come with greater VRAM capacities per card than consumer-grade ones, making them a clear choice for purpose-built LLM systems.
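If you want to confirm how much VRAM each card in an existing system actually offers, PyTorch can report it directly (assuming a CUDA-enabled build of torch is installed):

    # List each CUDA GPU and its total VRAM.
    import torch

    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        print(f"GPU {i}: {props.name}, {props.total_memory / 1024**3:.1f} GB VRAM")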


What storage drive(s) should I use for LLMs?

With the CPU, RAM, & GPU often taking so much of the focus when configuring a PC for LLMs, the storage configuration can be left out, or becomes an afterthought once most of the budget is already spent. This can be a huge and costly mistake! If the storage can't keep up with the CPU, RAM, & GPU, it will create a bottleneck, and then it won't matter how fast or capable the other hardware is.

While there are a few different storage solutions, the only ones that make sense for LLMs are NVMe SSDs.

  • NVMe M.2 SSDs: These currently come in two flavours on our website: Gen4 drives and Gen5 drives. Gen4 NVMe M.2 SSDs generally top out at around 7,000MB/s, while Gen5 drives start above 9,000MB/s and go up to 14,000MB/s (a quick load-time estimate follows below).
    These M.2 drives also connect directly to the motherboard (which may limit how many M.2 drives you can have in total, as some boards only support 2-3), freeing up the case/chassis drive bays for future additions.
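To see why this matters for LLMs, consider how long it takes just to read a large model's weights from disk. A rough Python estimate (sequential read only; real loads add filesystem and deserialisation overhead):

    # Rough model load time from disk: size divided by sequential read speed.
    def load_time_seconds(model_size_gb: float, read_speed_mb_s: float) -> float:
        return model_size_gb * 1000 / read_speed_mb_s

    print(load_time_seconds(140, 7_000))   # Gen4 NVMe: ~20s for 140GB of weights
    print(load_time_seconds(140, 14_000))  # Gen5 NVMe: ~10s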

Don't forget, it's important that you are in control of your data backups at all times!

For professional LLM work, we encourage a two-drive setup with the capacity of each depending on your desired budget and storage requirements. Capacities of 2TB-8TB are encouraged where possible.

  1. OS & Applications (NVMe M.2 SSD) - Should be large enough to house your operating system plus any other applications you require as part of your workflow (eg: 1TB+).
  2. Project files (NVMe M.2 SSD) - Keeping your projects on their own drive ensures that if your primary drive has an issue requiring an OS reinstall, you won't also lose the projects you need to work on.

External drives are a cause for concern, as they are known to contribute to performance and stability issues. They should therefore only be used to move data onto the PC, and back off again once the work is done - you should not be working directly from an external drive.
