
Problems compiling - compile host thermal issues

Damn it! dmesg extract on the physical machine:

[23996.969553] CPU3: Core temperature above threshold, cpu clock throttled (total events = 37164)
[23996.969554] CPU2: Core temperature above threshold, cpu clock throttled (total events = 37164)
[23996.969556] CPU1: Package temperature above threshold, cpu clock throttled (total events = 54936)
[23996.969558] CPU0: Package temperature above threshold, cpu clock throttled (total events = 54936)
[23996.969559] CPU2: Package temperature above threshold, cpu clock throttled (total events = 54936)
[23996.969569] CPU3: Package temperature above threshold, cpu clock throttled (total events = 54936)
[23996.970540] CPU2: Core temperature/speed normal
[23996.970541] CPU3: Core temperature/speed normal
[23996.970543] CPU1: Package temperature/speed normal
[23996.970544] CPU0: Package temperature/speed normal
[23996.970545] CPU3: Package temperature/speed normal
[23996.970551] CPU2: Package temperature/speed normal
[24174.310051] mce: [Hardware Error]: Machine check events logged
[24743.737847] CPU3: Package temperature above threshold, cpu clock throttled (total events = 60579)
[24743.737848] CPU2: Package temperature above threshold, cpu clock throttled (total events = 60579)
[24743.737852] CPU1: Package temperature above threshold, cpu clock throttled (total events = 60579)
[24743.737854] CPU0: Package temperature above threshold, cpu clock throttled (total events = 60579)
[24743.739856] CPU3: Package temperature/speed normal
[24743.739868] CPU2: Package temperature/speed normal
[24743.739875] CPU1: Package temperature/speed normal
[24743.739876] CPU0: Package temperature/speed normal
[25044.631631] CPU1: Package temperature above threshold, cpu clock throttled (total events = 66516)
[25044.631632] CPU0: Package temperature above threshold, cpu clock throttled (total events = 66516)
[25044.631635] CPU2: Package temperature above threshold, cpu clock throttled (total events = 66516)
[25044.631636] CPU3: Package temperature above threshold, cpu clock throttled (total events = 66516)
[25044.635628] CPU2: Package temperature/speed normal
[25044.635630] CPU1: Package temperature/speed normal
[25044.635632] CPU3: Package temperature/speed normal
[25044.635633] CPU0: Package temperature/speed normal

But (look at the fan!!):

simon@laptop:~$ sensors
thinkpad-isa-0000
Adapter: ISA adapter
fan1:         257 RPM

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +101.0°C  (high = +105.0°C, crit = +105.0°C)
Core 0:         +90.0°C  (high = +105.0°C, crit = +105.0°C)
Core 1:        +101.0°C  (high = +105.0°C, crit = +105.0°C)

I’m looking into it, but I cannot reload the ACPI kernel module because it is in use. I’ll need a restart to see whether I can increase the fan speed manually.

So, fan control doesn’t work. I can write arbitrary levels to /proc/acpi/ibm/fan, but the fan still stays at level 1 with about 250 RPM. So if temperature really is the issue, I can only limit the VM’s CPUs (number of cores and GHz) to ensure that the physical CPUs don’t get too hot… :frowning:

Shot in the dark, but maybe it uses the pwm control mentioned e.g. here?

Maybe also:
http://www.thinkwiki.org/wiki/How_to_control_fan_speed

# echo level auto > /proc/acpi/ibm/fan

I don’t think so as cat-ing the proc-file works without problems. But to be honest, I’m no expert in this field…

simon@laptop:~$ cat /proc/acpi/ibm/fan 
status:		enabled
speed:		257
level:		1
commands:	level <level> (<level> is 0-7, auto, disengaged, full-speed)
commands:	enable, disable
commands:	watchdog <timeout> (<timeout> is 0 (off), 1-120 (seconds))
simon@laptop:~$ 

I tried auto as well as all other levels, it doesn’t change anything… :-/
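To make "it doesn’t change anything" checkable, here is a minimal sketch that requests a level and then reads back what the driver reports. The helper names `fan_field` and `try_fan_level` and the `FAN_FILE` variable are made up for illustration; only the path /proc/acpi/ibm/fan and the field names come from the output above. Writing to the file needs root.

```shell
#!/bin/sh
# Print one field ("status", "speed" or "level") from fan status text.
# Reads the status text on stdin, so it also works on saved output.
fan_field() {
    # $1: field name without the trailing colon
    awk -v key="$1:" '$1 == key { print $2; exit }'
}

# Hypothetical helper: request a level, wait briefly, then report what the
# driver says afterwards. FAN_FILE defaults to the path used in this thread.
try_fan_level() {
    fan_file="${FAN_FILE:-/proc/acpi/ibm/fan}"
    echo "level $1" > "$fan_file" || return 1
    sleep 2
    echo "requested=$1 level=$(fan_field level < "$fan_file") rpm=$(fan_field speed < "$fan_file")"
}
```

Calling `try_fan_level 7` as root would then show on one line whether the reported level and RPM actually followed the request.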

It’s best to leave all that to the ThinkPad BIOS/EC, I guess, but just for testing:

# rmmod thinkpad_acpi 
# modprobe thinkpad_acpi fan_control=1
# echo level 7 > /proc/acpi/ibm/fan
# echo level auto > /proc/acpi/ibm/fan

Update: Maybe this is only valid for the real IBM ThinkPads. I’m not sure if the Lenovo hardware is still the same. Maybe it’s even confusing your computer. Maybe updating the BIOS without any extra software will help?

I would have thought the same. I mean we’re not in 1994 anymore, are we?

Unfortunately I cannot unload the module as it is used (as mentioned earlier):

root@laptop:/home/simon# rmmod thinkpad_acpi
rmmod: ERROR: Module thinkpad_acpi is in use

Is there a way to check which parameters a kernel module was loaded with? (I only found ways to show the available parameters.)

Well, I believe it used to work with my X220 (fun fact: until it melted away – but to be fair the whole kitchen burned down, so I cannot really blame the laptop’s fan…), so I believe Lenovo should in general still work.

Yeah, a UEFI update is on my list as well.

That one is only(?) for the Yoga 13, which is an IdeaPad, while my Yoga 12 is a ThinkPad. I thus hope that the IBM/ThinkPad stuff still works…

btw: Thanks everybody for your support!

edit: in IRC I was told the temperature shouldn’t cause a segfault. So maybe the problem is somewhere else. A little frustrating… :-/

Check the files under:

/etc/modprobe.d/ and/or cat /proc/modules

A lot of people tell a lot of things, me included :slight_smile: I would just try another computer … some hardware issues just show up under load.

But we are slowly getting off-topic here, so good luck for now!

Just another build failed, the build log again shows a segfault.

There is no thinkpad_acpi file under /etc/modprobe.d/ and it seems the module is loaded without the parameter:

simon@laptop:~$ cat /proc/modules | grep thinkpad
thinkpad_acpi 86016 1 - Live 0x0000000000000000

Easier said than done. :-/ The only other hardware I have lying around here is a RasPi and a Cubietruck… I might just wait until fp publishes a build if nobody has an idea how to find the cause of my problems.

Is the acpid still running?

Looks like it:

simon@laptop:~$ ps aux | grep acpid
root       565  0.0  0.0      0     0 ?        S<   15:31   0:00 [ktpacpid]
root      2082  0.0  0.0   4396  1736 ?        Ss   15:31   0:00 /usr/sbin/acpid
simon    28534  0.0  0.0  15220  2256 pts/12   S+   21:04   0:00 grep acpid

I guess this will keep the module busy. Just stop it before unloading (I think…):
service acpid stop (does Ubuntu use service? I’m more in the Fedora/RHEL region :wink: )

Klaus

Yes, but it seems there is still more active:

simon@laptop:~$ ps aux | grep acpid
root       565  0.0  0.0      0     0 ?        S<   15:31   0:00 [ktpacpid]
root      2082  0.0  0.0   4396  1736 ?        Ss   15:31   0:00 /usr/sbin/acpid
simon    28534  0.0  0.0  15220  2256 pts/12   S+   21:04   0:00 grep acpid
simon@laptop:~$ sudo service acpid stop
[sudo] password for simon: 
simon@laptop:~$ sudo rmmod thinkpad_acpi
rmmod: ERROR: Module thinkpad_acpi is in use
simon@laptop:~$ ps aux | grep acpid
simon      334  0.0  0.0  15216  2124 pts/12   S+   21:07   0:00 grep acpid
root       565  0.0  0.0      0     0 ?        S<   15:31   0:00 [ktpacpid]

Hmm… Might the kernel process ktpacpid also use the module? Honestly, no idea. Probably, no help short of rebooting…
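One way to narrow this down is to read the fields of /proc/modules directly: the third field is the reference count and the fourth lists the other modules that depend on this one. The sketch below assumes that layout; `module_refs` is a made-up helper name, and the optional second argument exists only so it can be tried on a saved copy of the file.

```shell
#!/bin/sh
# Print the reference count and the dependent-module list for one module.
# $1: module name
# $2: optional file in /proc/modules format (default: /proc/modules)
module_refs() {
    awk -v mod="$1" '$1 == mod { print "refcount=" $3, "used_by=" $4 }' "${2:-/proc/modules}"
}
```

For the line shown earlier in this thread (`thinkpad_acpi 86016 1 - Live ...`), this reports a reference count of 1 and `-` for the dependents, meaning no other loaded module depends on thinkpad_acpi. The remaining reference would then be held by something that is not a module, which may be why stopping acpid alone does not free it. That is an interpretation, not something the output proves.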

Not sure if I get you right. Purely rebooting probably won’t help as acpid will get started again… First I would need to make sure that the module is loaded with the correct parameter. But I’m not sure which program/process starts it and where and how to pass the parameter “fan_control=1” to it. I would guess placing a file in /etc/modprobe.d/ might be the right way to do it, but I don’t understand the syntax the existing files use.

Ok, I see. Yes, this would be an entry in a file in /etc/modprobe.d.
First, check whether an entry is already present.

If not, add a file acpid.conf with one line:

options acpid fan_control=1

This should do it.

Edit: add a depmod -a for good measure…

No such file was present.
Is it one line?

simon@laptop:~$ cat /etc/modprobe.d/acpid.conf
options acpid fan_control=1 depmod -a
simon@laptop:~$ 

Ah, sorry. The “depmod -a” does not belong in this file. After creating the file, run the command

depmod -a

on the command line, as root.
That is, the one-line content is fine without the depmod -a.
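A note on the syntax, since it was asked about above: a line in /etc/modprobe.d has the form `options <module> <parameter>=<value>`, and the name after `options` must be the kernel module that takes the parameter. Here that is thinkpad_acpi, as in the modprobe command earlier in the thread. acpid is a userspace daemon, so an `options acpid ...` line would most likely have no effect. A minimal sketch follows; the function name and the file name thinkpad_acpi.conf are arbitrary choices, and the directory argument is only there so the snippet can be tried without root.

```shell
#!/bin/sh
# Write a modprobe options file that enables fan control for thinkpad_acpi.
# $1: target directory (default: /etc/modprobe.d, which needs root)
write_fan_control_conf() {
    conf_dir="${1:-/etc/modprobe.d}"
    # One "options" line: module name first, then parameter=value.
    printf 'options thinkpad_acpi fan_control=1\n' > "$conf_dir/thinkpad_acpi.conf"
}
```

The option only applies the next time the module is loaded, so a reboot (or an unload and reload, if the module can be unloaded) would be needed afterwards.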

Did a reboot, and it seems that I made it worse – or that we now know that fan control is not supported…

simon@laptop:~$ cat /proc/modules | grep thinkpad
thinkpad_acpi 86016 1 - Live 0x0000000000000000
nvram 16384 1 thinkpad_acpi, Live 0x0000000000000000
snd 81920 28 snd_usb_audio,snd_usbmidi_lib,snd_hda_codec_hdmi,snd_hda_codec_conexant,snd_hda_codec_generic,snd_hda_intel,snd_hda_codec,snd_hwdep,snd_pcm,snd_rawmidi,snd_seq,snd_seq_device,snd_timer,thinkpad_acpi, Live 0x0000000000000000
video 36864 2 thinkpad_acpi,i915, Live 0x0000000000000000
root@laptop:/home/simon# ps aux | grep acpid
root       558  0.0  0.0      0     0 ?        S<   21:43   0:00 [ktpacpid]
root      2003  0.0  0.0   4396  1784 ?        Ss   21:43   0:00 /usr/sbin/acpid
root     20636  0.0  0.0  15220  2248 pts/2    S+   21:52   0:00 grep --color=auto acpid
root@laptop:/home/simon# echo "level 7" > /proc/acpi/ibm/fan 
bash: echo: write error: Invalid argument

You can check whether your modifications worked:

there should be (probably among others) an entry fan_control. If you “cat” it, it should show a one.
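As a sketch of that check: loaded modules usually expose their readable parameters under /sys/module/<module>/parameters/. The helper below assumes that layout, and that the parameter belongs to thinkpad_acpi rather than acpid. `module_param` is a made-up name, and the third argument only exists to allow a dry run against a test directory.

```shell
#!/bin/sh
# Print the value of one parameter of a loaded module, or a note if absent.
# $1: module name, $2: parameter name
# $3: optional sysfs module root (default: /sys/module)
module_param() {
    f="${3:-/sys/module}/$1/parameters/$2"
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "not exposed"
    fi
}
```

Running `module_param thinkpad_acpi fan_control` after a reboot would show whether the option was picked up. Boolean parameters are typically printed as Y or N rather than 1 or 0.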