Dell Keyboard Hi-Z Failures
Introduction
I had a Dell Precision 7750 recently show up on my bench where a number of keys would only work intermittently. The main culprits were 2, Caps Lock, and F1. W also required more pressure than normal to register, though it was more consistent.
I came across a few forum posts for other Dell machines where some people reported that they needed to press W, S, X simultaneously when the fault occurred to restore function to the affected keys temporarily.
Generally the accepted fix was to simply replace the keyboard flex, but this sounds like it’s just resetting a timer. It would be beneficial and interesting to do a root-cause analysis on the issue to understand why it’s happening.
Connector Pinout
Below is a section of the trackpad PCBA with the keyboard ZIF pinout labelled.
And of course, the FFC cable from the keyboard is the same but mirrored.
Almost all of the pins go to the Microchip ECE1117 keyboard controller. There’s one pin that goes to a via near R21 that I didn’t bother tracing, and four pins at the other end that go to the 820R resistors. These resistors all go to FETs, which in turn connect to PWM2/3/4 on the ECE1117. Thus, they are likely driving the Caps Lock, Num Lock, and Mute button LEDs.
Update: After inspection, the four pins on the left side of the FPC connector are indeed for the LEDs. They are identified by the serpentine (meandering) traces on the flex, which make distributed-element resistors for LED current limiting. It’s crude, but it saves having to fit resistors on the keyboard membrane, which can’t be assembled using traditional soldering methods due to its screen-printed silver ink and acetate construction. The LEDs are connected in common-cathode configuration, which indicates they are high-side driven.
Additionally, the pin marked Via is shorted to the pin marked R11 on the keyboard membrane at the Num Lock cathode. Given R11 is the common cathode for all the LEDs, this means that this pin is likely grounded on the trackpad PCB. In turn, given the Via pin is shorted to R11, this is likely a signal for detecting whether the membrane is connected to the ZIF connector. The Via signal is grounded when the FPC connector is installed correctly.
Key Matrix Basics
The theory of operation for these matrix-style keyboards is well documented elsewhere, so I won’t go into the details here.
Suffice it to say, they’re a row/column arrangement with pulled-up/down inputs connected to either the rows or columns (generally the rows). For the ECE1117, these pull resistors are nominally 73kOhms. Conversely, a driven voltage (opposite to the pull direction of the inputs) is applied at the complementary columns, such that when a key is pressed, it connects this driven voltage to the corresponding input.
Given the nature of this circuit, it’s clear that ideally the switches would be perfect conductors so they do not form any potential divider with the pull resistors at the inputs. In reality, however, all conductors have impedance and will contribute to attenuation of signals.
A common rule of thumb for cases like this is that the switches should have an impedance/resistance <1% of the pull resistor. In our case, this would set a pass threshold of ~700 Ohms; that is, any switch with a closed impedance below this is no cause for suspicion. The absolute limit would be determined by the minimum tolerance of the pull resistor (let’s call this 42kOhms for the ECE1117) and the <0.3*VDD and >0.7*VDD rule of thumb that’s typical for VIL and VIH thresholds, respectively. The closed switch and the pull resistor form a potential divider, so in our case the limit is 42kOhms * 0.3/0.7 = 18kOhms; anything higher will cross the 0.3 or 0.7 ratio threshold.
To be clear, however, even if a switch came in marginally under this 18kOhm hard limit, it should still be cause for suspicion. After all, intermittency and marginal passes make bedfellows. What we’re really looking for are contextual outliers. If you’re measuring a membrane and all the switches measure <500 Ohms and you find one that measures 5kOhms, there’s likely a problem with the latter. In short, exercise your judgement.
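The two limits above fall out of the potential divider formed by the closed switch and the pull resistor. Here is a minimal Python sketch of that arithmetic, using only the figures quoted in the text (73kOhms nominal, 42kOhms assumed minimum, 0.3 ratio). The `classify` helper and its labels are my own illustrative names and only encode the rule of thumb; they are no substitute for comparing against neighbouring keys.

```python
def max_switch_resistance(r_pull_ohms: float, ratio: float) -> float:
    """Largest closed-switch resistance that keeps the divider at `ratio`.

    Divider seen by a pulled-up input:  V_in / VDD = R_sw / (R_sw + R_pull)
    Solving V_in / VDD <= ratio gives:  R_sw <= R_pull * ratio / (1 - ratio)
    """
    return r_pull_ohms * ratio / (1.0 - ratio)


R_PULL_NOMINAL = 73_000  # ohms, nominal ECE1117 pull resistor (from the text)
R_PULL_MIN = 42_000      # ohms, minimum tolerance assumed in the text

soft_limit = 0.01 * R_PULL_NOMINAL                    # 1% rule of thumb
hard_limit = max_switch_resistance(R_PULL_MIN, 0.3)   # 0.3*VDD rule of thumb


def classify(r_switch_ohms: float) -> str:
    """Illustrative labels only: below the 1% limit, in between, or past the hard limit."""
    if r_switch_ohms < soft_limit:
        return "ok"
    if r_switch_ohms < hard_limit:
        return "suspicious"
    return "fail"


print(f"soft limit: {soft_limit:.0f} ohms, hard limit: {hard_limit:.0f} ohms")
for r in (100, 5_000, 20_000):
    print(r, classify(r))
```

The same expression covers the pulled-down case by symmetry, since 1 - 0.7 = 0.3.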
Measurements
I purchased a replacement keyboard and set out to find the pins corresponding to the problematic keys. Here’s what I managed to find:
As we can see, while the keys all have different sense lines (KSI), they all share a common drive line (KSO01). This lends some credibility to the forum suggestion that pressing W, S, X simultaneously temporarily resolves the fault. On the working keyboard, the pressed state of any of these keys measures <200 Ohms, though we can see they’re not all the same. F1 and Caps Lock have among the highest resistances and indeed these were the most temperamental keys, followed by 2. Interestingly, S and X on the faulty keyboard didn’t show any signs of intermittency. X does have the lowest loop resistance at 100 Ohms, so perhaps it’s just slipping under thanks to KSI6 despite a failing KSO01 line. S may have a lower resistance than what I have captured in the table; it’s difficult to get a consistent press with a pinky finger while also probing the fragile carbon-printed terminals. Regardless, we now have a working reference.
The table below shows the loop resistances measured for the same keys on the faulty keyboard when pressed with normal typing pressure.
The resistances correlate perfectly with the reliability of the keys. Even applying as much pressure as one could with an extended pinky while having probes in both hands, I could not get anything other than open loop for F1, 2, and Caps Lock.
Inspection
Once I had replaced the keyboard, I dismantled the faulty one and took a high-resolution scan of it (well, as much of it as my flatbed would allow). I had to heavily compress it to fit within Substack’s size limits, so apologies for the JPEG artefacts. I also mirrored the photo so that it matches the keyboard layout as viewed from the front. Ordinarily these traces are actually on the back, facing the metal frame.
Here’s what we get when we trace out the KSO01:
The first thing we notice is that W and S are primary branches off the trace from the FFC connector. S goes on to feed X in series, while W feeds 2, F1, and Caps Lock. There’s also some brown staining around the space bar area. I thought this might be spill damage, but there’s no matching stain pattern on the backlight diffuser. For now, we’ll assume it’s negligible.
Zooming into the W key, we see a circular artefact that isn’t present on the S or X keys.
This could explain why S and X continued to work reliably, while everything connected to the W branch was completely non-functional, with the W-key itself exhibiting intermittency.
If we flip the membrane over to the key side and peel off the rubber cap, we can immediately see the issue:
It’s difficult to capture it on camera, but what you’re looking at are two lollipop-shaped pads separated by a very small air gap. One of these pads connects to the KSO01 net (the one in the background in this image with two tracks connecting to it), while the other (in the case of W) connects to KSI2. When the rubber cap is compressed by the key cap, the two lollipop pads are pushed together, which closes the switch. This is the general principle of how these multi-layered membrane keyboards work.
The problem with the above image though is the sizeable hole in the middle of the top pad. Again, it’s hard to see, but the darker silver blot you’re seeing is actually the bottom pad through a void in the top pad. Whatever has happened to this key has resulted in the top pad seemingly being fused to the bottom at a point when the key was pressed, and ripped apart when released.
We can also see evidence of fracturing on the two traces leaving the bottom pad. Look at the circular outline in the plastic membrane and notice the metal ‘filings’ right along the perimeter (this is likely conductive silver ink though, so more aptly peeling). This could explain why the F1, 2, and Caps Lock keys completely failed, while W worked intermittently. The fracture in the trace knocked out the downstream keys (F1, 2, Caps Lock), while W continued to degrade over time, its resistance increasing as the hole got larger. I confirmed this by very carefully cutting the top layer of the membrane off with a scalpel and checking continuity. If I placed the two probes anywhere in the area marked by the red arrows, I would register continuity. However, if I moved one of the probes to the opposite side of the circular perimeter where the blue arrows are, continuity would be lost. This confirmed that there was indeed very fine fracturing in the silver ink layer around the perimeter of the circular cut-out.
But why would this happen, and why does it seem to hit this particular key so consistently? I thought perhaps this was a thermal issue from the W key being situated over a hot component on the Precision. However, the same issue occurs with the same keys on Inspirons as well. What are the odds there are hot components in exactly the same position resulting in the same type of failure on a machine with a vastly lower TDP than the Precision? Unlikely. It also seems there are complaints for other brands entirely (HP, Lenovo, etc.) about this exact column (F1, 2, W, S, X) showing failure.
Therefore it seems there may be an intrinsic design flaw w.r.t. how the matrix is designed.
Membrane Reverse-Engineering
In the hopes of finding out if there was some design-level explanation for the failure, I took the time to trace out all the KSI/KSO key assignments. Here they are; I suffered so you don’t have to:
I’ve highlighted the keys sharing the KSO01 column based on complete failure (red), intermittent function (yellow), and normal function (green). There doesn’t seem to be anything that particularly stands out.
Suppose we take the hole in the W key as pointing to the culprit, and look for blame in the size of the column or row that W occupies. KSO01 is among the shortest of all the KSOs containing alphabetical keys. The only KSOs that are shorter are mainly ones that contain modifiers.
In a similar vein, KSI2 is the second shortest of all the KSIs with only 11 keys, compared to 14 on KSI1/3/5. Curiously however, KSI1/3/5 all contain keys that have suffered complete failure (Caps Lock, F1, 2). So perhaps there’s something to look into here.
Let’s set up a quick simulation of a subset of this circuit. This site uses TinyURL for shortening, so I’m not sure how long the link will stay up. If it eventually dies, go to https://www.falstad.com/circuit/circuitjs.html and import the netlist shown below.
$ 1 0.000005 5.459815003314424 50 5 50 5e-11
155 224 256 288 256 0 0
155 528 256 560 256 0 0
155 672 256 704 256 0 0
w 640 256 672 256 0
w 768 320 816 320 0
w 816 320 816 208 0
w 816 208 208 208 0
w 528 288 512 288 0
w 512 288 512 368 0
w 672 288 656 288 0
w 656 288 656 368 0
w 656 368 512 368 0
w 512 368 352 368 0
w 208 368 208 288 0
w 224 288 208 288 0
R 144 368 96 368 0 2 400 2.5 2.5 0 0.5
207 464 496 512 496 4 KSO01
207 624 496 672 496 4 KSO02
207 768 496 816 496 4 KSO03
w 224 256 208 256 0
w 208 368 144 368 0
w 528 256 480 256 0
w 208 256 208 208 0
155 368 256 400 256 0 0
w 352 368 208 368 0
w 352 288 352 368 0
w 368 288 352 288 0
w 336 256 368 256 0
w 336 256 336 320 0
w 336 320 320 320 0
w 464 320 464 416 0
w 624 320 624 416 0
w 768 320 768 416 0
207 896 656 896 608 4 KSO01
207 1024 656 1024 608 4 KSO02
207 768 656 768 608 4 KSO03
s 768 896 832 896 0 1 false
x 796 867 814 870 4 24 Q
x 796 947 812 950 4 24 A
w 768 896 768 976 0
s 768 976 832 976 0 1 false
s 768 1056 832 1056 0 1 false
w 768 976 768 1056 0
x 796 1027 810 1030 4 24 Z
207 1280 912 1328 912 4 KSI2
w 832 896 848 896 0
w 848 896 848 912 0
w 848 912 976 912 0
w 976 896 976 912 0
w 960 896 976 896 0
x 924 1027 940 1030 4 24 X
w 896 976 896 1056 0
s 896 1056 960 1056 0 1 false
s 896 976 960 976 0 1 false
w 896 896 896 976 0
x 924 947 940 950 4 24 S
x 924 867 946 870 4 24 W
s 896 896 960 896 0 1 false
w 976 976 976 992 0
w 848 992 976 992 0
w 848 976 848 992 0
w 832 976 848 976 0
w 976 976 960 976 0
w 976 1056 976 1072 0
w 848 1072 976 1072 0
w 848 1056 848 1072 0
w 832 1056 848 1056 0
w 976 1056 960 1056 0
w 1104 1056 1088 1056 0
w 976 1072 1104 1072 0
w 1104 1056 1104 1072 0
w 1104 976 1088 976 0
w 976 992 1104 992 0
w 1104 976 1104 992 0
s 1024 896 1088 896 0 1 false
w 1024 816 1024 896 0
x 1052 867 1068 870 4 24 E
x 1052 947 1069 950 4 24 D
w 1024 896 1024 976 0
s 1024 976 1088 976 0 1 false
s 1024 1056 1088 1056 0 1 false
w 1024 976 1024 1056 0
x 1052 1027 1069 1030 4 24 C
w 1088 896 1104 896 0
w 1104 896 1104 912 0
w 976 912 1104 912 0
207 1280 992 1328 992 4 KSI4
207 1280 1072 1328 1072 4 KSI6
r 1168 736 1168 800 0 10000
r 1216 736 1216 800 0 10000
r 1264 736 1264 800 0 10000
w 1120 832 1120 800 0
w 1280 1072 1264 1072 0
w 1264 1072 1104 1072 0
w 1168 912 1168 800 0
w 1216 992 1280 992 0
w 1216 992 1104 992 0
w 1216 992 1216 800 0
w 1168 912 1280 912 0
w 1168 912 1104 912 0
r 464 416 464 496 0 10
r 624 416 624 496 0 10
r 768 416 768 496 0 10
w 784 256 768 256 0
w 640 256 624 256 0
w 480 256 464 256 0
d 1024 736 1024 656 2 default
w 1104 816 1104 832 0
w 1088 816 1104 816 0
x 1052 787 1065 790 4 24 3
s 1024 816 1088 816 0 1 false
s 896 816 960 816 0 1 false
x 924 787 937 790 4 24 2
w 960 816 976 816 0
w 976 816 976 832 0
w 848 816 848 832 0
w 832 816 848 816 0
x 796 787 809 790 4 24 1
s 768 816 832 816 0 1 false
w 896 816 896 896 0
w 768 816 768 896 0
w 1024 736 1024 816 0
w 896 736 896 816 0
w 768 736 768 816 0
207 1280 832 1328 832 4 KSI5
w 848 832 976 832 0
w 976 832 1104 832 0
w 1120 832 1280 832 0
w 1120 832 1104 832 0
r 1120 736 1120 800 0 10000
w 1264 1072 1264 800 0
w 1120 736 1120 720 0
w 1120 720 1168 720 0
w 1168 720 1168 736 0
w 1216 720 1216 736 0
w 1168 720 1216 720 0
w 1264 720 1264 736 0
w 1216 720 1264 720 0
R 1264 720 1264 688 0 0 40 5 0 0 0.5
s 992 736 992 656 4 0 false 1
w 1024 656 992 656 0
w 1024 736 992 736 0
w 896 736 864 736 0
w 896 656 864 656 0
s 864 736 864 656 4 0 false 1
d 896 736 896 656 2 default
w 768 736 736 736 0
w 768 656 736 656 0
s 736 736 736 656 4 0 false 1
d 768 736 768 656 2 default
o 33 64 0 4103 10 0.4 0 2 33 3 KSO01
o 34 64 0 4103 5 0.4 0 2 34 3 KSO02
o 35 64 0 4103 5 0.003125 0 2 35 3 KSO03
o 44 64 0 4103 5 0.00009765625 0 2 44 3 KSI2
o 86 64 0 4103 5 0.00009765625 0 2 86 3 KSI4
o 124 64 0 4103 5 0.00009765625 0 2 124 3 KSI5
o 87 64 0 4103 5 0.00009765625 0 2 87 3 KSI6
If successful, you should see the below circuit:
Here I’ve set up a cut-down version of the main four rows of the Q, W, and E columns.
Based on Table 9-3 in the ECE1117 datasheet, we can see that the KSO drive polarity is active low. That is, the selected column output drives the pulled-up KSI to ground through the pressed switch. Accordingly, in the simulation I have pulled up each KSI with 10K (arbitrary). The KSOs are driven by a flip-flop chain that simply ripples a pulse in an infinite loop to simulate a KSO strobe. I’ve fitted 10 Ohm resistors in series with every KSO to prevent zero-resistance loops that would otherwise break the simulation (infinite current). Why will become clear in a moment.
So let’s start playing with it. We can see that pressing the W key causes the pulled-up KSI2 line to be grounded through the active-low pulse on KSO01 when it rolls around. By correlating the asserted KSO and KSI signals, the firmware can therefore determine which key is pressed.
We can also see by inspection that diodeless keyboards like this (we’ll talk about the diodes I have bypassed at the top in a moment) are not protected against ghosting, so they use blocking instead.
Suppose we press W, E and 3. The KSI5 line’s pull-up is grounded by the KSO01 pulse via the path 3→E→W→KSO01. Thus, when the KSO01 pulse rolls around, we will see a ‘ghost’ of it on KSI5, which will lead to the controller incorrectly concluding that the 2 key was pressed.
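The ghost path can be reproduced without a circuit simulator by treating every closed switch as a link between its KSO and KSI nets and flood-filling from the strobed column. The sketch below does that for the 12-key subset in the simulation. The `KEYMAP` assignments are read off the netlist labels, and the model assumes open-drain-style drive (un-strobed KSOs float) and ideal switches, which is a simplification rather than a statement about the ECE1117.

```python
from collections import deque

# Subset of the matrix from the simulation netlist: key -> (KSO column, KSI row).
KEYMAP = {
    "1": ("KSO03", "KSI5"), "2": ("KSO01", "KSI5"), "3": ("KSO02", "KSI5"),
    "Q": ("KSO03", "KSI2"), "W": ("KSO01", "KSI2"), "E": ("KSO02", "KSI2"),
    "A": ("KSO03", "KSI4"), "S": ("KSO01", "KSI4"), "D": ("KSO02", "KSI4"),
    "Z": ("KSO03", "KSI6"), "X": ("KSO01", "KSI6"), "C": ("KSO02", "KSI6"),
}


def scan(pressed):
    """Return the set of keys a naive scan would report for the pressed keys."""
    # Each closed switch joins its column net and its row net.
    links = {}
    for key in pressed:
        col, row = KEYMAP[key]
        links.setdefault(col, set()).add(row)
        links.setdefault(row, set()).add(col)

    reported = set()
    for strobe in sorted({col for col, _ in KEYMAP.values()}):
        # Flood-fill from the strobed column: every reachable net is dragged low.
        low, queue = {strobe}, deque([strobe])
        while queue:
            net = queue.popleft()
            for neighbour in links.get(net, ()):
                if neighbour not in low:
                    low.add(neighbour)
                    queue.append(neighbour)
        # A key on the strobed column looks pressed if its row reads low.
        for key, (col, row) in KEYMAP.items():
            if col == strobe and row in low:
                reported.add(key)
    return reported


print(sorted(scan({"W"})))            # ['W']
print(sorted(scan({"W", "E", "3"})))  # ['2', '3', 'E', 'W'] (2 is the ghost)
```

With only W held, the scan reports W alone. With W, E, and 3 held, the low on KSO01 reaches KSI5 through the chain of closed switches and 2 appears even though it was never pressed.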
To get around this, the firmware is simply designed to ignore or ‘block’ certain key combinations. You can try this out for yourself. Select a text box and hold down the W, E, and 3 keys. Then go ahead and try to press 2. You’ll find it doesn’t type. In fact you won’t be able to type 1, 4, 7, 8, 9, 0, Q, R, U, I, O, P either. But you will be able to type 5, 6, -, +/=, T, Y, [, and ].
Why becomes clear when we look at how these keys are arranged on the KSIs. You can see every key that’s ignored coexists on the same KSI line as W, E, or 3, while every key that’s correctly registered is on a different KSI line. This is how blocking works.
The reason for going into this detail about blocking is that multiple key presses on a common KSI relate to a detail in Table 9-3 of the ECE1117 datasheet; specifically, that KSOs are driven high/low in both the de-asserted and asserted cases. Let’s see what happens when we press W and E simultaneously:
We can see that we end up with a short from KSO01 to KSO02. This becomes problematic when one of them is asserted, as you get a high→low short whose current is only limited by the GPIO drive strength and the resistance of the keyboard traces themselves. Table 9-3 does use both assert/de-assert and driven high/low in different contexts. So it is possible in scan mode that the drive is open-drain only, in which case the above problem doesn’t exist. However we do have what appears to be evidence of an overcurrent event on the W key by way of a hole, so it can’t be ruled out.
We can see that if we open up the bypasses across the KSO diodes (simulating an open-drain drive), then the problem does indeed go away.
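To put a number on the contention in the simulation, here is a small Python sketch of the KSO-to-KSO loop. The 5 V logic level and the two 10 Ohm series resistors are the simulation’s values; on the real hardware the supply voltage, the GPIO output resistance, and the membrane trace resistance are all unknown to me, so treat the function as a template rather than a measurement. It also ignores the small current sourced by the KSI pull-ups.

```python
def contention_current(vdd, r_loop_ohms, open_drain=False):
    """Current (A) between a de-asserted and an asserted KSO bridged by two keys.

    With push-pull (totem-pole) drive the high output sources current into the
    low output through the loop resistance. With open-drain drive the
    de-asserted output is high-impedance, so there is no driver-to-driver path.
    """
    if open_drain:
        return 0.0
    return vdd / r_loop_ohms


def loop_power(current_a, r_loop_ohms):
    """Power (W) dissipated in the loop resistance at the given current."""
    return current_a ** 2 * r_loop_ohms


VDD_SIM = 5.0        # volts, logic level used in the simulation
R_LOOP_SIM = 10 + 10  # ohms, one 10 Ohm series resistor per KSO, ideal switches

i_push_pull = contention_current(VDD_SIM, R_LOOP_SIM)
print(f"push-pull:  {i_push_pull:.3f} A, {loop_power(i_push_pull, R_LOOP_SIM):.2f} W")
print(f"open-drain: {contention_current(VDD_SIM, R_LOOP_SIM, open_drain=True):.3f} A")
```

In the simulation that works out to a quarter of an amp for as long as W and E are both held and one of the two columns is strobed, which is why the series resistors were needed to keep the solver happy.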
Conclusion
It’s not clear what the root cause was for the failure of the W key, though it is evident there was some form of overcurrent event. Where from remains unknown. If the cause was indeed due to totem-pole drive of the KSO lines being exposed to short circuit conditions via the keyboard matrix, then there are a number of options to improve reliability:
Configure the ECE1117 KSO lines in open-drain mode (may not be possible if the chip doesn’t allow for this).
Add series diodes to KSO lines so they behave as open-drain (adds cost).
Reduce drive strength on KSO lines to limit current during short circuit cases (may not be possible if the chip doesn’t allow for this).
Add series resistors to KSO lines to limit current during short circuit cases (cheaper than diode solution).
Increase trace thickness/conductivity of keyboard membrane traces. This will reduce the power dissipated in the membrane when exposed to short-circuit conditions and hopefully allow it to ‘ride through’ (may require a change in construction, which may increase cost).
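As a rough feel for the series-resistor option, the sketch below sizes a per-line resistor for a chosen short-circuit current and then checks that it does not eat into the input threshold margin discussed earlier. The 3.3 V supply and the 10 mA current target are placeholder assumptions of mine, not values from the ECE1117 datasheet. The 200 Ohm switch resistance and 42kOhm minimum pull resistor are the figures used earlier in this post.

```python
def per_line_series_resistance(vdd, i_max):
    """Resistor (ohms) per KSO so a KSO-to-KSO short stays under i_max.

    Two KSO lines are involved in the short, so two resistors sit in the loop.
    """
    return vdd / i_max / 2.0


def divider_ratio(r_series, r_switch, r_pull):
    """Fraction of VDD seen at a pulled-up KSI when a pressed key pulls it low."""
    r_low = r_series + r_switch
    return r_low / (r_low + r_pull)


VDD = 3.3       # volts, ASSUMED supply voltage
I_MAX = 0.010   # amps, ASSUMED acceptable short-circuit current
R_SWITCH = 200  # ohms, upper bound measured on the working keyboard
R_PULL_MIN = 42_000  # ohms, minimum pull resistor assumed earlier

r_series = per_line_series_resistance(VDD, I_MAX)
ratio = divider_ratio(r_series, R_SWITCH, R_PULL_MIN)
print(f"series resistor per KSO: {r_series:.0f} ohms")
print(f"KSI level with key pressed: {ratio:.4f} * VDD (limit 0.3)")
```

Under these assumptions the resistor lands in the low hundreds of ohms, which keeps a healthy key well below the 0.3*VDD threshold while capping the fault current.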