Wednesday, September 5, 2012

How to customize CPU frequency steps

Hey everybody,

here comes a brand new guide on an interesting topic: CPU frequency scaling. Not many of you may know that on recent Intel CPUs (Sandy Bridge and Ivy Bridge) the kernel governor can pick the most suitable frequency from a long list of them. In my case, for example (i7 3720QM), I can range from 1200MHz to 3600MHz in steps of 100MHz. This might be exciting, but every frequency transition costs a little energy.
Furthermore, on several high-end CPUs, which are also the most power-hungry ones, the lower boundary is cut at 1200MHz, while the real minimum is 800MHz.
In this advanced guide I want to show you how to modify your BIOS tables to fully control the CPU frequency steps.

First of all, this is not a game. You might seriously introduce instability, crashes and whatever, so please be sure you know what you are doing. This "how to" is tested for Sandy and Ivy Bridge CPUs, but should also apply to CoreDuo and later.

One prerequisite for this operation is knowing how to recompile a kernel. If you don't know what that means or how to do it, you'd better go out and find another way to mess up your Linux distro :P.

I'm working on a Debian-based machine running Linux Mint 13 Maya. If you are on Ubuntu or Debian you can follow this guide step by step; if you use another distro, take care to translate the instructions to your environment.

Here comes another annoying paragraph of knowledge, so you can fully understand what we are doing. Usually the BIOS takes care of detecting what your machine has plugged in and of exposing a useful interface to the kernel. This is usually done through the ACPI tables. Sometimes either the BIOS or the kernel messes things up, so you have to put your hands under the hood and fix them. This is possible by bypassing the interface provided by the BIOS with a custom-built one, compiled directly into the kernel. Following this guide you'll dump the tables from your running BIOS, add or modify sections to implement things (following the ACPI standard) and put the result in the kernel.

Ok, time to get down to business.
I assume you already have the tools to compile your own kernel, so I won't cover them. We'll also need msr-tools and iasl, so you can issue:

sudo apt-get install msr-tools iasl

The first is a utility to look into your CPU's Model Specific Registers, the second is a compiler that can decompile and recompile Intel ACPI Source Language code.

Now we need a tool to accurately read frequencies and states; get it here and install it.

The first step is to dump the multiplier limits of your CPU. I found a Python script and added support for Sandy Bridge and Ivy Bridge registers, so you can download it here. Now run it:

sudo modprobe msr
sudo python read_msr.py --readmsr

you'll get an extended output: just look for max and min VIDs and FIDs

[cpu7] [CURRENT] FID:31 HID:0 DID:0 VID:0 
[cpu7] [TARGET]  FID:31 HID:0 DID:0 VID:0 
[cpu7] [HIGHEST] FID:26 (HID:0 DID:0) VID:0
[cpu7] [LOWEST]  FID:8 (HID:0 DID:0) VID:0 
[cpu7] [SLFM]    FID:0 VID:0 
[cpu7] [IDA]     FID:0 VID:0 
[cpu7] [CURRENTLY ACTIVE FEATURES] IDA:0 EIST:1


Your output might differ slightly. You can see that the current FID is higher than the listed maximum: this is because of turbo mode, so if possible run the test while a heavy-load process or a stress test (e.g. cpuburnutils) is running. In this case 31 is the highest multiplier.
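
As a rough idea of what reading these limits involves, here is a minimal sketch in Python 3. It is not the script used above: it assumes the msr module is loaded, that it runs as root, and that your CPU reports its ratios in MSR_PLATFORM_INFO (0xCE), as Sandy Bridge and Ivy Bridge should. The function names are made up for this example.

```python
import os
import struct

MSR_PLATFORM_INFO = 0xCE  # assumption: register layout of Nehalem and later


def read_msr(register, cpu=0):
    """Read one 64-bit MSR through /dev/cpu/N/msr (needs root and `modprobe msr`)."""
    fd = os.open("/dev/cpu/%d/msr" % cpu, os.O_RDONLY)
    try:
        # The file offset selects the register; every MSR is 8 bytes wide.
        raw = os.pread(fd, 8, register)
    finally:
        os.close(fd)
    return struct.unpack("<Q", raw)[0]


def decode_platform_info(raw):
    """Return (lowest, highest non-turbo) multipliers from a raw 0xCE value."""
    highest = (raw >> 8) & 0xFF   # bits 15:8, maximum non-turbo ratio
    lowest = (raw >> 40) & 0xFF   # bits 47:40, maximum efficiency ratio
    return lowest, highest
```

Calling decode_platform_info(read_msr(MSR_PLATFORM_INFO)) as root returns the two limits; a raw value carrying the ratios from the output above, (8 << 40) | (26 << 8), decodes to (8, 26). Turbo multipliers are not part of this register.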

At this point you should know that Sandy Bridge and Ivy Bridge no longer let you manage VIDs. So you won't be able to define voltages, though you can still set the frequencies.

The next step is (obviously) to dump and modify (aka mess up ;D) the BIOS tables. What we need is the current DSDT file; we can simply dump it with:

sudo cat /sys/firmware/acpi/tables/DSDT > dsdt.dat

but before you can read anything you have to decompile it:

iasl -d dsdt.dat
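
If you want to be sure the dump is sane before editing it, the standard ACPI table header can be checked with a few lines of Python 3. This is only a sketch: the function name is made up, and it checks nothing beyond the signature, the declared length and the checksum.

```python
import struct


def check_acpi_table(data, expected_signature=b"DSDT"):
    """Sanity-check a raw ACPI table dump and return its declared length."""
    if len(data) < 36:
        raise ValueError("shorter than the 36-byte ACPI table header")
    # Bytes 0-3 hold the signature, bytes 4-7 the total length (little endian).
    signature, length = struct.unpack_from("<4sI", data, 0)
    if signature != expected_signature:
        raise ValueError("unexpected signature %r" % signature)
    if length != len(data):
        raise ValueError("header says %d bytes, file has %d" % (length, len(data)))
    # All bytes of a valid table, checksum byte included, sum to 0 modulo 256.
    if sum(data) % 256 != 0:
        raise ValueError("checksum mismatch")
    return length
```

For the file dumped above you would call check_acpi_table(open("dsdt.dat", "rb").read()); it raises ValueError if something looks wrong.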

Now open the resulting dsdt.dsl with your favourite text editor and snoop around. It might still look like a Babylonian tablet to your eyes. But no fear: make a backup copy and edit the file. We want to find the CPU section, so search for "CPU". You'll find something like:


Scope (_PR)
    {
        Processor (CPU0, 0x01, 0x00000410, 0x06) {}
        Processor (CPU1, 0x02, 0x00000410, 0x06) {}
        Processor (CPU2, 0x03, 0x00000410, 0x06) {}
        Processor (CPU3, 0x04, 0x00000410, 0x06) {}
        Processor (CPU...

Ok, this is the right place to paste our new code. Be sure to edit it to your needs. We are going to add a general section which carries the info for all the CPUs, then apply it core by core.
The most important part is the definition of the control registers: ACPI provides a per-CPU interface to set and monitor the CPU frequency, called _PCT. We will use register 0x199 as the control register (first line), but also as the status register (second line). This is suggested for Sandy and Ivy Bridge, because the CPU can scale autonomously if the load requires it: if the kernel checked the real status it would see unexpected transitions and report a fault. Other CPU families shouldn't be able to do so and can use 0x198 for the status register, but if you experience strange behavior use 0x199 for both.
Then we define the P-states. For each state enter the frequency, the power (the empty slot between the two commas), the transition latency, the bus master latency, the command value and the target value:
               freq  lat  lat  cmd    target
               MHz             FFVV   FFVV
Package (0x06){3100,,0x0A,0x0A,0x1f00,0x1f00}

Command and target fields must be the same; each is built by writing the FID and the VID, in that order, as two hexadecimal bytes (FID 31 and VID 0 give 0x1f00). Be sure to use only FIDs and VIDs supported by your CPU!
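
To avoid conversion mistakes, the values can be generated instead of typed by hand. A minimal sketch in Python 3, assuming a 100MHz base clock (true for Sandy and Ivy Bridge, not for older CPUs) and the same latency values used in this guide; the function names are made up:

```python
def control_value(fid, vid=0):
    """Pack FID (high byte) and VID (low byte) into the FFVV value."""
    if not (0 <= fid <= 0xFF and 0 <= vid <= 0xFF):
        raise ValueError("FID and VID must each fit in one byte")
    return (fid << 8) | vid


def pss_package(fid, vid=0, bus_mhz=100, latency=0x0A):
    """Format one P-state line in the layout used in this guide."""
    value = control_value(fid, vid)
    # The power field is left empty, hence the two consecutive commas.
    return "Package (0x06){%d,,0x%02X,0x%02X,0x%04x,0x%04x}" % (
        fid * bus_mhz, latency, latency, value, value)
```

For example, pss_package(31) gives back exactly the line shown above, and printing pss_package(fid) for each multiplier you want, from the highest to the lowest, produces the body of the P-state list.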
Here comes a sample; be sure to adapt the steps and the number of CPUs:

DO NOT JUST CUT AND PASTE

  Scope (_PR)
    {
        Name (PPC, 0x00)

        Name (PCT, Package (0x02){ // Registers definition
            // Control register (used to set the P-state)
            ResourceTemplate (){Register (FFixedHW, 0x4, 0x00, 0x199, ,)
            },
            // Status register; Core Duo and later use 0x198 here
            ResourceTemplate (){Register (FFixedHW, 0x4, 0x00, 0x199, ,)
            }
        })

        Name (PSS, Package (0x06){ // <-- 06 is the number of P-states defined
            // P-state definition
            Package (0x06){3600,,0x0A,0x0A,0x2400,0x2400}, // p-state 0
            Package (0x06){3100,,0x0A,0x0A,0x1f00,0x1f00}, // p-state 1
            Package (0x06){2600,,0x0A,0x0A,0x1a00,0x1a00}, // p-state 2
            Package (0x06){1600,,0x0A,0x0A,0x1000,0x1000}, // p-state 3
            Package (0x06){1200,,0x0A,0x0A,0x0c00,0x0c00}, // p-state 4
            Package (0x06){ 800,,0x0A,0x0A,0x0800,0x0800}, // p-state 5
        })
    
        Processor (CPU0, 0x01, 0x00000410, 0x06) {
        Alias(PPC,_PPC)
        Alias(PCT,_PCT)
        Alias(PSS,_PSS)
        }
        Processor (CPU1, 0x02, 0x00000410, 0x06) {
        Alias(PPC,_PPC)
        Alias(PCT,_PCT)
        Alias(PSS,_PSS)
        }
        Processor (CPU2, 0x03, 0x00000410, 0x06) {
        Alias(PPC,_PPC)
        Alias(PCT,_PCT)
        Alias(PSS,_PSS)
        }
        Processor (CPU3, 0x04, 0x00000410, 0x06) {
        Alias(PPC,_PPC)
        Alias(PCT,_PCT)
        Alias(PSS,_PSS)
        }
        Processor (CPU4, 0x05, 0x00000410, 0x06) {
        Alias(PPC,_PPC)
        Alias(PCT,_PCT)
        Alias(PSS,_PSS)
        }
        Processor (CPU5, 0x06, 0x00000410, 0x06) {
        Alias(PPC,_PPC)
        Alias(PCT,_PCT)
        Alias(PSS,_PSS)
        }
        Processor (CPU6, 0x07, 0x00000410, 0x06) {
        Alias(PPC,_PPC)
        Alias(PCT,_PCT)
        Alias(PSS,_PSS)
        }
        Processor (CPU7, 0x08, 0x00000410, 0x06) {
        Alias(PPC,_PPC)
        Alias(PCT,_PCT)
        Alias(PSS,_PSS)
        }
    }



Add the properly edited code for each logical CPU you find listed in the original DSDT file. Once done, we'll want to compile it and check that it has no errors:

iasl -tc dsdtnew.dsl

I suggest not trying to fix the warnings, since the original DSDT might already be buggy (LOL). If you get errors outside of your own code, Google is a good friend and you can surely ask for some help (or leave a comment).

Once finished you'll find a dsdt.hex file. Move a copy of it to the include/ folder of your kernel sources. Now edit the kernel configuration, adding the name of the DSDT file to be included (the CONFIG_ACPI_CUSTOM_DSDT_FILE option) in:

Power management and ACPI options>ACPI Support>"Custom DSDT Table file to include"

Compile the kernel! Reboot! And check your work with both i7z and cpufreq-info.
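
The same check can be scripted. A minimal sketch in Python 3, assuming the acpi-cpufreq driver is in use and exposes the list of steps, in kHz, in /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies; the function names are made up:

```python
def available_mhz(text):
    """Parse the kHz list from scaling_available_frequencies into MHz."""
    return [int(khz) // 1000 for khz in text.split()]


def missing_steps(expected_mhz, text):
    """Return the expected P-state frequencies the kernel does not list."""
    listed = set(available_mhz(text))
    return [mhz for mhz in expected_mhz if mhz not in listed]
```

Read the sysfs file into a string and pass it together with the frequencies you put in the P-state list, e.g. missing_steps([3600, 3100, 2600, 1600, 1200, 800], text). An empty result means the kernel picked up every step you defined.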

If you messed things up, boot a previous kernel, modify the DSDT again and retry.

For any help feel free to leave a comment.

Cheers!



PS: I have tested this on an IvyBridge CPU, any feedback will be appreciated!

7 comments:

  1. Hi

    I found your guide very interesting, but could you quantify some energy savings (powertop, ...)?

    In newer kernels there is also the related pstate driver, did you try something with this?

  2. Hey, I tried a bit, but I need some more time to test. By the way I'm working on an extension of this workaround, stay tuned.

    Cheers

  3. Hello,

    Thank you for your detailed explanation.
    I used your description to build my dsdt; I use a Sandybridge processor (i7-2760QM).
    After recompiling the kernel I notice in dmesg that my new dsdt is applied, but the folder /sys/devices/system/cpu/cpu0/cpufreq is lost.
    Any idea what mistake I am making?

    Thanks,
    Rajiv

    Replies
    1. Hey,
      Probably you messed something up, so the driver does not load at all. Try to check it, and be sure to test first with the acpi-cpufreq driver before the new one, which actually sucks and messes up a lot of stuff.

      Cheers

    2. On a side note, I also have another problem now while I am compiling the dsdt. Here is my current dsdt:http://nopaste.info/e4659bdda0.html

      The error is the following:
      Package (0x06){800,,10,10,0x0800,0x0800},
      ^ Initializer list shorter than declared package length

      I don't see what the problem is.

    3. try to put a space or a couple of empty brackets between those commas... sometimes the compiler gets really silly... =_=

    4. Unfortunately, none of them work :-(
      I don't know what the error could be. If it wants me to define voltage levels, I need to be informed about it, but it doesn't give me enough info.

