sjjh
February 13, 2016, 2:24pm
1
Damn it! dmesg extract on the physical machine:
[23996.969553] CPU3: Core temperature above threshold, cpu clock throttled (total events = 37164)
[23996.969554] CPU2: Core temperature above threshold, cpu clock throttled (total events = 37164)
[23996.969556] CPU1: Package temperature above threshold, cpu clock throttled (total events = 54936)
[23996.969558] CPU0: Package temperature above threshold, cpu clock throttled (total events = 54936)
[23996.969559] CPU2: Package temperature above threshold, cpu clock throttled (total events = 54936)
[23996.969569] CPU3: Package temperature above threshold, cpu clock throttled (total events = 54936)
[23996.970540] CPU2: Core temperature/speed normal
[23996.970541] CPU3: Core temperature/speed normal
[23996.970543] CPU1: Package temperature/speed normal
[23996.970544] CPU0: Package temperature/speed normal
[23996.970545] CPU3: Package temperature/speed normal
[23996.970551] CPU2: Package temperature/speed normal
[24174.310051] mce: [Hardware Error]: Machine check events logged
[24743.737847] CPU3: Package temperature above threshold, cpu clock throttled (total events = 60579)
[24743.737848] CPU2: Package temperature above threshold, cpu clock throttled (total events = 60579)
[24743.737852] CPU1: Package temperature above threshold, cpu clock throttled (total events = 60579)
[24743.737854] CPU0: Package temperature above threshold, cpu clock throttled (total events = 60579)
[24743.739856] CPU3: Package temperature/speed normal
[24743.739868] CPU2: Package temperature/speed normal
[24743.739875] CPU1: Package temperature/speed normal
[24743.739876] CPU0: Package temperature/speed normal
[25044.631631] CPU1: Package temperature above threshold, cpu clock throttled (total events = 66516)
[25044.631632] CPU0: Package temperature above threshold, cpu clock throttled (total events = 66516)
[25044.631635] CPU2: Package temperature above threshold, cpu clock throttled (total events = 66516)
[25044.631636] CPU3: Package temperature above threshold, cpu clock throttled (total events = 66516)
[25044.635628] CPU2: Package temperature/speed normal
[25044.635630] CPU1: Package temperature/speed normal
[25044.635632] CPU3: Package temperature/speed normal
[25044.635633] CPU0: Package temperature/speed normal
But (look at the fan!!):
simon@laptop:~$ sensors
thinkpad-isa-0000
Adapter: ISA adapter
fan1: 257 RPM
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +101.0°C (high = +105.0°C, crit = +105.0°C)
Core 0: +90.0°C (high = +105.0°C, crit = +105.0°C)
Core 1: +101.0°C (high = +105.0°C, crit = +105.0°C)
I’m looking into it, but I cannot reload the acpi kernel module as it is in use. I’ll need a restart to see if I can manually increase the fan speed.
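To see how the throttling splits between core and package events, the kernel log can be summarized. This is only a minimal sketch, assuming the message format shown in the dmesg extract above; the function name summarize_throttle is made up for illustration.

```shell
# Summarize thermal throttle messages from kernel log text on stdin.
# Prints one line per kind: "<count> <Core|Package>".
summarize_throttle() {
    grep 'temperature above threshold' \
        | sed -E 's/.*CPU[0-9]+: (Core|Package) temperature.*/\1/' \
        | sort | uniq -c | awk '{print $1, $2}'
}

# Usage on a live system (may need root, depending on kernel settings):
#   dmesg | summarize_throttle
```

Note that this counts log lines, not the "total events" counter the kernel prints, which can be much larger because the log messages are rate limited.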
sjjh
February 13, 2016, 2:43pm
2
So, fan control doesn’t work. I can write arbitrary levels to /proc/acpi/ibm/fan, but it still stays at level 1 with 250 rpm. So if temperature is really the issue, I can only limit the CPUs of the VM (number of cores and GHz) to ensure that the physical CPUs don’t get too hot…
Shot in the dark, but maybe it uses the pwm control mentioned e.g. here?
sjjh
February 13, 2016, 3:28pm
5
I don’t think so, as cat-ing the proc-file works without problems. But to be honest, I’m no expert in this field…
simon@laptop:~$ cat /proc/acpi/ibm/fan
status: enabled
speed: 257
level: 1
commands: level <level> (<level> is 0-7, auto, disengaged, full-speed)
commands: enable, disable
commands: watchdog <timeout> (<timeout> is 0 (off), 1-120 (seconds))
simon@laptop:~$
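When trying several levels in a row, it can help to read back only the field of interest after each write. Below is a minimal sketch, assuming the "key: value" layout shown in the output above; the helper name fan_field is hypothetical.

```shell
# Print the value of one field ("status", "speed" or "level") from text in
# the format of /proc/acpi/ibm/fan, read from stdin.
fan_field() {
    awk -v key="$1" -F':[ \t]*' '$1 == key { print $2; exit }'
}

# Usage on the laptop itself:
#   fan_field level < /proc/acpi/ibm/fan
#   fan_field speed < /proc/acpi/ibm/fan
```

If the level read back never changes after a write, the write itself is probably being rejected or ignored rather than the fan being slow to react.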
Maybe also
echo level auto > /proc/acpi/ibm/fan
I tried auto as well as all other levels, it doesn’t change anything… :-/
It’s best to leave all that to the Thinkpad bios/EC I guess, but just for testing:
# rmmod thinkpad_acpi
# modprobe thinkpad_acpi fan_control=1
# echo level 7 > /proc/acpi/ibm/fan
# echo level auto > /proc/acpi/ibm/fan
Update: Maybe this is only valid for the real IBM Thinkpads. I’m not sure if the Lenovo hardware is still the same. Maybe it’s even confusing your computer. Maybe updating the bios without any extra software will help?
sjjh
February 13, 2016, 3:51pm
7
I would have thought the same. I mean we’re not in 1994 anymore, are we?
Unfortunately I cannot unload the module as it is used (as mentioned earlier):
root@laptop:/home/simon# rmmod thinkpad_acpi
rmmod: ERROR: Module thinkpad_acpi is in use
Is there a way to check with which parameters a kernel module was loaded (I only found ways to show the available parameters)?
Well, I believe it used to work with my X220 (fun fact: until it melted away – but to be fair the whole kitchen burned down, so I cannot really blame the laptop’s fan…), so I believe Lenovo should in general still work.
Yeah, a UEFI update is on my list as well.
That one is only(?) for the Yoga 13, which is an Ideapad, while my Yoga 12 is a Thinkpad. I thus hope that the IBM/Thinkpad stuff still works…
btw: Thanks everybody for your support!
edit: in IRC I was told the temperature shouldn’t cause a segfault. So maybe the problem is somewhere else. A little frustrating… :-/
Check the files under:
/etc/modprobe.d/ and/or cat /proc/modules
A lot of people tell a lot of things, me included. I would just try another computer … some hardware issues just show up under load.
But we are slowly getting off-topic here, so good luck for now!
sjjh
February 13, 2016, 7:12pm
9
Yet another build just failed; the build log again shows a segfault.
There is no thinkpad_acpi file under /etc/modprobe.d/ and it seems the module is loaded without the parameter:
simon@laptop:~$ cat /proc/modules | grep thinkpad
thinkpad_acpi 86016 1 - Live 0x0000000000000000
Easier said than done. :-/ The only other hardware I have lying around here is a RasPi and a Cubietruck… I might just wait until fp publishes a build if nobody has an idea how to find out the cause of my problems.
lklaus
February 13, 2016, 8:03pm
10
Is the acpid still running?
sjjh
February 13, 2016, 8:05pm
11
Looks like it:
simon@laptop:~$ ps aux | grep acpid
root 565 0.0 0.0 0 0 ? S< 15:31 0:00 [ktpacpid]
root 2082 0.0 0.0 4396 1736 ? Ss 15:31 0:00 /usr/sbin/acpid
simon 28534 0.0 0.0 15220 2256 pts/12 S+ 21:04 0:00 grep acpid
lklaus
February 13, 2016, 8:06pm
12
I guess this will keep the module busy. Just stop it for unloading (I think…)
service acpid stop (is Ubuntu using service? I’m more in the Fedora/RHEL region)
Klaus
sjjh
February 13, 2016, 8:09pm
13
Yes, but it seems there is still more active:
simon@laptop:~$ ps aux | grep acpid
root 565 0.0 0.0 0 0 ? S< 15:31 0:00 [ktpacpid]
root 2082 0.0 0.0 4396 1736 ? Ss 15:31 0:00 /usr/sbin/acpid
simon 28534 0.0 0.0 15220 2256 pts/12 S+ 21:04 0:00 grep acpid
simon@laptop:~$ sudo service acpid stop
[sudo] password for simon:
simon@laptop:~$ sudo rmmod thinkpad_acpi
rmmod: ERROR: Module thinkpad_acpi is in use
simon@laptop:~$ ps aux | grep acpid
simon 334 0.0 0.0 15216 2124 pts/12 S+ 21:07 0:00 grep acpid
root 565 0.0 0.0 0 0 ? S< 15:31 0:00 [ktpacpid]
lklaus
February 13, 2016, 8:14pm
14
Hmm… Might the kernel process ktpacpid also use the module? Honestly, no idea. Probably, no help short of rebooting…
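One hint is already in the /proc/modules output: its third field is the module's reference count and the fourth lists the modules that depend on it ("-" if none). Below is a small sketch for pulling those two fields out; the function name module_users is made up.

```shell
# Print "<refcount> <dependent modules>" for one module, reading text in
# /proc/modules format from stdin.
module_users() {
    awk -v mod="$1" '$1 == mod { print $3, $4 }'
}

# Usage:
#   module_users thinkpad_acpi < /proc/modules
```

For the line posted earlier this gives a count of 1 with no dependent modules, so the reference is apparently not held by another module. That would be consistent with something outside the module list keeping it busy, though this output alone cannot say what.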
sjjh
February 13, 2016, 8:21pm
15
Not sure if I get you right. Purely rebooting probably won’t help, as acpid will get started again… First I would need to make sure that the module is loaded with the correct parameter. But I’m not sure which program/process loads it, or where and how to pass the parameter “fan_control=1” to it. I would guess placing a file in /etc/modprobe.d/ might be the right way to do it, but I don’t understand the syntax the existing files use.
lklaus
February 13, 2016, 8:29pm
16
Ok, I see. Yes, this would be an entry in a file in /etc/modprobe.d.
Best to first check whether there is already an entry present.
If not, add a file acpid.conf with one line.
This should do it.
Edit: add a depmod -a for good measure…
sjjh
February 13, 2016, 8:32pm
17
No such file was present.
Is it one line?
simon@laptop:~$ cat /etc/modprobe.d/acpid.conf
options acpid fan_control=1 depmod -a
simon@laptop:~$
lklaus
February 13, 2016, 8:38pm
18
Ah, sorry. The “depmod -a” does not belong in this file; after creating the file, issue that command on the command line, as root.
That is, the one-line content would be fine without depmod -a.
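For reference, an options line in /etc/modprobe.d/ has the form "options <module> <parameter>=<value>", where the name is the kernel module that owns the parameter. The modprobe command earlier in the thread passes fan_control to thinkpad_acpi, while acpid is a userspace daemon, so a line naming thinkpad_acpi is probably what is needed here. Below is a minimal sketch under that assumption; the helper name and the target file name are made up.

```shell
# Write a modprobe options line for one module parameter to a config file.
# Arguments: <module> <parameter=value> <config file>
write_module_option() {
    printf 'options %s %s\n' "$1" "$2" > "$3"
}

# Example (as root); the file name is arbitrary but should end in .conf:
#   write_module_option thinkpad_acpi fan_control=1 /etc/modprobe.d/thinkpad_acpi.conf
```

The option only takes effect the next time the module is loaded, so a reboot (or a successful unload and reload) is still needed afterwards.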
sjjh
February 13, 2016, 8:54pm
19
Did a reboot and it seems that I disimproved it – or that we now know that fan control is not supported…
simon@laptop:~$ cat /proc/modules | grep thinkpad
thinkpad_acpi 86016 1 - Live 0x0000000000000000
nvram 16384 1 thinkpad_acpi, Live 0x0000000000000000
snd 81920 28 snd_usb_audio,snd_usbmidi_lib,snd_hda_codec_hdmi,snd_hda_codec_conexant,snd_hda_codec_generic,snd_hda_intel,snd_hda_codec,snd_hwdep,snd_pcm,snd_rawmidi,snd_seq,snd_seq_device,snd_timer,thinkpad_acpi, Live 0x0000000000000000
video 36864 2 thinkpad_acpi,i915, Live 0x0000000000000000
root@laptop:/home/simon# ps aux | grep acpid
root 558 0.0 0.0 0 0 ? S< 21:43 0:00 [ktpacpid]
root 2003 0.0 0.0 4396 1784 ? Ss 21:43 0:00 /usr/sbin/acpid
root 20636 0.0 0.0 15220 2248 pts/2 S+ 21:52 0:00 grep --color=auto acpid
root@laptop:/home/simon# echo "level 7" > /proc/acpi/ibm/fan
bash: echo: write error: Invalid argument
lklaus
February 13, 2016, 8:57pm
20
You can check whether your modifications worked:
there should be (probably among others) an entry fan_control. If you “cat” it, it should show a one.
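Readable parameters of a loaded module normally show up under /sys/module/<module>/parameters/, which is presumably the directory meant here (an assumption, since the path is not given above). Below is a small sketch for reading one; the function name module_param and its optional third argument are hypothetical.

```shell
# Print the current value of a module parameter; returns non-zero if the
# parameter is not exposed.
# Arguments: <module> <parameter> [sysfs module root, default /sys/module]
module_param() {
    param_file="${3:-/sys/module}/$1/parameters/$2"
    [ -r "$param_file" ] && cat "$param_file"
}

# Usage:
#   module_param thinkpad_acpi fan_control
```

Boolean parameters are usually reported as Y or N rather than 1 or 0, so a "Y" here would mean the option was picked up.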