Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FS#177 - ipq806x: USB3 ports fail if booted when USB3 flash drives inserted during boot time #6605

Closed
openwrt-bot opened this issue Sep 14, 2016 · 17 comments
Labels

Comments

@openwrt-bot
Copy link

jmomo:

I am seeing a strange problem on my new TRENDnet tew827dru, which is ipq8064 based. It has two USB3 ports.

I have noticed a problem with USB3 flash drives. If I boot the system with a USB3 flash drive in one of the USB ports, that port won't allow any USB3 drive to work in it until the next reboot.

It seems to be a xhci driver issue.

This behavior is somewhat intermittent and varied. Occasionally my Silicon Power USB3 drive will work fine after boot, and the SanDisk USB3 drive almost always appears in "lsusb", but the partitions can't be seen, and it still doesn't work right.

The USB3 flash drives works as-expected if I plug them in after boot is completed. The issue only appears if the drive was already in the USB port during boot time.

USB2 flash drives do not appear to be affected.

The USB3 hard drive I tested seems to be working normally.

USB flash drives tested:
SanDisk Ultra Fit 16GB (USB3)
Patriot Tab Series 32GB (USB3)
Silicon Power 32GB (USB3)
Kingston DTSE9 8GB (USB2)
Patriot 8GB (USB2)

Below is an example boot log where I have a USB3 drive in each of the two ports. Neither works:

[ 6.275249] usbcore: registered new interface driver usbfs
[ 6.275322] usbcore: registered new interface driver hub
[ 6.279713] usbcore: registered new device driver usb
[ 6.320705] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[ 6.321204] ehci-platform: EHCI generic platform driver
[ 6.331613] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[ 6.332310] ohci-platform: OHCI generic platform driver
[ 6.841285] xhci-hcd xhci-hcd.0.auto: xHCI Host Controller
[ 6.841329] xhci-hcd xhci-hcd.0.auto: new USB bus registered, assigned bus number 1
[ 6.845931] xhci-hcd xhci-hcd.0.auto: hcc params 0x0228f065 hci version 0x100 quirks 0x00010010
[ 6.853327] xhci-hcd xhci-hcd.0.auto: irq 200, io mem 0x11000000
[ 6.862602] hub 1-0:1.0: USB hub found
[ 6.872028] xhci-hcd xhci-hcd.0.auto: xHCI Host Controller
[ 6.875912] xhci-hcd xhci-hcd.0.auto: new USB bus registered, assigned bus number 2
[ 6.881170] usb usb2: We don't know the algorithms for LPM for this host, disabling LPM.
[ 6.889250] hub 2-0:1.0: USB hub found
[ 6.900865] xhci-hcd xhci-hcd.1.auto: xHCI Host Controller
[ 6.904684] xhci-hcd xhci-hcd.1.auto: new USB bus registered, assigned bus number 3
[ 6.910066] xhci-hcd xhci-hcd.1.auto: hcc params 0x0228f065 hci version 0x100 quirks 0x00010010
[ 6.917541] xhci-hcd xhci-hcd.1.auto: irq 201, io mem 0x10000000
[ 6.926734] hub 3-0:1.0: USB hub found
[ 6.936227] xhci-hcd xhci-hcd.1.auto: xHCI Host Controller
[ 6.940023] xhci-hcd xhci-hcd.1.auto: new USB bus registered, assigned bus number 4
[ 6.945455] usb usb4: We don't know the algorithms for LPM for this host, disabling LPM.
[ 6.953458] hub 4-0:1.0: USB hub found
[ 6.965885] of_get_named_gpiod_flags: parsed 'gpios' property of node '/gpio-leds/usb[0]' - status (0)
[ 6.967608] usbcore: registered new interface driver usb-storage
[ 7.182177] usb 1-1: new high-speed USB device number 2 using xhci-hcd
[ 7.252035] usb 3-1: new high-speed USB device number 2 using xhci-hcd
[ 7.382317] usb 2-1: new SuperSpeed USB device number 2 using xhci-hcd
[ 7.417343] usb-storage 2-1:1.0: USB Mass Storage device detected
[ 7.428209] scsi host0: usb-storage 2-1:1.0
[ 7.442238] usb 4-1: new SuperSpeed USB device number 2 using xhci-hcd
[ 7.465865] usb-storage 4-1:1.0: USB Mass Storage device detected
[ 7.466672] scsi host1: usb-storage 4-1:1.0
[ 8.462818] scsi 1:0:0:0: Direct-Access USB DISK 3.0 PMAP PQ: 0 ANSI: 6
[ 17.195224] xhci-hcd xhci-hcd.0.auto: xHCI host not responding to stop endpoint command.
[ 17.195259] xhci-hcd xhci-hcd.0.auto: Assuming host is dying, halting host.
[ 17.227985] xhci-hcd xhci-hcd.0.auto: Host not halted after 16000 microseconds.
[ 17.228010] xhci-hcd xhci-hcd.0.auto: Non-responsive xHCI host is not halting.
[ 17.234094] xhci-hcd xhci-hcd.0.auto: Completing active URBs anyway.
[ 17.241418] xhci-hcd xhci-hcd.0.auto: HC died; cleaning up
[ 17.248041] usb usb1-port1: couldn't allocate usb_device
[ 17.253296] usb 2-1: USB disconnect, device number 2
[ 17.263718] xhci-hcd xhci-hcd.1.auto: xHCI host not responding to stop endpoint command.
[ 17.263722] xhci-hcd xhci-hcd.1.auto: Assuming host is dying, halting host.
[ 17.289471] xhci-hcd xhci-hcd.1.auto: Host not halted after 16000 microseconds.
[ 17.289481] xhci-hcd xhci-hcd.1.auto: Non-responsive xHCI host is not halting.
[ 17.289491] xhci-hcd xhci-hcd.1.auto: Completing active URBs anyway.
[ 17.289534] xhci-hcd xhci-hcd.1.auto: HC died; cleaning up
[ 17.324608] usb usb3-port1: couldn't allocate usb_device
[ 17.329821] usb 4-1: USB disconnect, device number 2

And here I am plugging both drives in after the system booted normally.

[ 6.193094] usbcore: registered new interface driver usbfs
[ 6.193173] usbcore: registered new interface driver hub
[ 6.197552] usbcore: registered new device driver usb
[ 6.238600] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[ 6.239108] ehci-platform: EHCI generic platform driver
[ 6.249525] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[ 6.250089] ohci-platform: OHCI generic platform driver
[ 6.761332] xhci-hcd xhci-hcd.0.auto: xHCI Host Controller
[ 6.761378] xhci-hcd xhci-hcd.0.auto: new USB bus registered, assigned bus number 1
[ 6.765992] xhci-hcd xhci-hcd.0.auto: hcc params 0x0228f065 hci version 0x100 quirks 0x00010010
[ 6.773385] xhci-hcd xhci-hcd.0.auto: irq 200, io mem 0x11000000
[ 6.782646] hub 1-0:1.0: USB hub found
[ 6.792070] xhci-hcd xhci-hcd.0.auto: xHCI Host Controller
[ 6.795937] xhci-hcd xhci-hcd.0.auto: new USB bus registered, assigned bus number 2
[ 6.801200] usb usb2: We don't know the algorithms for LPM for this host, disabling LPM.
[ 6.809293] hub 2-0:1.0: USB hub found
[ 6.820906] xhci-hcd xhci-hcd.1.auto: xHCI Host Controller
[ 6.824739] xhci-hcd xhci-hcd.1.auto: new USB bus registered, assigned bus number 3
[ 6.830118] xhci-hcd xhci-hcd.1.auto: hcc params 0x0228f065 hci version 0x100 quirks 0x00010010
[ 6.837600] xhci-hcd xhci-hcd.1.auto: irq 201, io mem 0x10000000
[ 6.846784] hub 3-0:1.0: USB hub found
[ 6.856269] xhci-hcd xhci-hcd.1.auto: xHCI Host Controller
[ 6.860070] xhci-hcd xhci-hcd.1.auto: new USB bus registered, assigned bus number 4
[ 6.865495] usb usb4: We don't know the algorithms for LPM for this host, disabling LPM.
[ 6.873508] hub 4-0:1.0: USB hub found
[ 6.885940] of_get_named_gpiod_flags: parsed 'gpios' property of node '/gpio-leds/usb[0]' - status (0)
[ 27.122596] usb 4-1: new SuperSpeed USB device number 2 using xhci-hcd
[ 27.144608] usb-storage 4-1:1.0: USB Mass Storage device detected
[ 27.155356] scsi host0: usb-storage 4-1:1.0
[ 28.153237] scsi 0:0:0:0: Direct-Access USB DISK 3.0 PMAP PQ: 0 ANSI: 6
[ 30.753700] usb 2-1: new SuperSpeed USB device number 2 using xhci-hcd
[ 30.776011] usb-storage 2-1:1.0: USB Mass Storage device detected
[ 30.776282] scsi host1: usb-storage 2-1:1.0

@openwrt-bot
Copy link
Author

john:

are there power gpios for usb on the unit ? if they exist we could force them low during boot/probe

@openwrt-bot
Copy link
Author

jmomo:

I regret there are not. I scanned all the GPIOs and OEM source shows no indication of anything connected to the USB ports. Definitely a bummer.

There are some device tree changes I will tinker with this weekend, but I am a clueless noob in this particular area.

@openwrt-bot
Copy link
Author

mkresin:

Would you please test again. The USB driver was updated in LEDE with https://git.lede-project.org/70434c3f948749690de24effbfd98f7a64e0991f

@openwrt-bot
Copy link
Author

nwperfguy:

Hi

Same problem on TP-Link Archer C2600. Extroot works with USB 2.0 only.
USB 3.0 pendrives have the same issue as above. LEDE version id: r2437+12-7c47f43

[ 15.804603] xhci-hcd xhci-hcd.0.auto: xHCI host not responding to stop endpoint command. [ 15.804654] xhci-hcd xhci-hcd.0.auto: Assuming host is dying, halting host. [ 15.840270] xhci-hcd xhci-hcd.0.auto: Host not halted after 16000 microseconds. [ 15.840314] xhci-hcd xhci-hcd.0.auto: Non-responsive xHCI host is not halting. [ 15.846396] xhci-hcd xhci-hcd.0.auto: Completing active URBs anyway. [ 15.853751] xhci-hcd xhci-hcd.0.auto: HC died; cleaning up

Regards,

Zoltan

@openwrt-bot
Copy link
Author

dissent1:

Could you try to apply this patch and check how it goes?

@openwrt-bot
Copy link
Author

dissent1:

V2 patch here, please test who faces this problem

@openwrt-bot
Copy link
Author

dissent1:

This patch has been upstream for a while, so snapshots and 17.01 rc1 already contain it.

Please report if this issue is fixed for you who experience it.

@openwrt-bot
Copy link
Author

reiffert:

Please re-open FS#177. I'm running into the same issue on ipq806x R7500. Relevant dmesg:

[ 2.947426] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver [ 2.947943] ehci-platform: EHCI generic platform driver [ 2.957125] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver [ 2.959543] ohci-platform: OHCI generic platform driver [ 3.069589] xhci-hcd xhci-hcd.0.auto: xHCI Host Controller [ 3.069659] xhci-hcd xhci-hcd.0.auto: new USB bus registered, assigned bus number 1 [ 3.074210] xhci-hcd xhci-hcd.0.auto: hcc params 0x0228f065 hci version 0x100 quirks 0x00010010 [ 3.081705] xhci-hcd xhci-hcd.0.auto: irq 168, io mem 0x11000000 [ 3.091538] hub 1-0:1.0: USB hub found [ 3.096524] hub 1-0:1.0: 1 port detected [ 3.100606] xhci-hcd xhci-hcd.0.auto: xHCI Host Controller [ 3.104107] xhci-hcd xhci-hcd.0.auto: new USB bus registered, assigned bus number 2 [ 3.109751] usb usb2: We don't know the algorithms for LPM for this host, disabling LPM. [ 3.118021] hub 2-0:1.0: USB hub found [ 3.125427] hub 2-0:1.0: 1 port detected [ 3.129530] xhci-hcd xhci-hcd.1.auto: xHCI Host Controller [ 3.132942] xhci-hcd xhci-hcd.1.auto: new USB bus registered, assigned bus number 3 [ 3.138560] xhci-hcd xhci-hcd.1.auto: hcc params 0x0228f065 hci version 0x100 quirks 0x00010010 [ 3.145828] xhci-hcd xhci-hcd.1.auto: irq 169, io mem 0x10000000 [ 3.155248] hub 3-0:1.0: USB hub found [ 3.160769] hub 3-0:1.0: 1 port detected [ 3.164430] xhci-hcd xhci-hcd.1.auto: xHCI Host Controller [ 3.168402] xhci-hcd xhci-hcd.1.auto: new USB bus registered, assigned bus number 4 [ 3.173697] usb usb4: We don't know the algorithms for LPM for this host, disabling LPM. [ 3.181692] hub 4-0:1.0: USB hub found [ 3.189606] hub 4-0:1.0: 1 port detected [ 3.195128] kmodloader: done loading kernel modules from /etc/modules-boot.d/* [ 3.206452] init: - preinit - [ 3.418439] usb 1-1: new high-speed USB device number 2 using xhci-hcd [ 3.689805] usb 2-1: new SuperSpeed USB device number 2 using xhci-hcd

[ 13.438408] xhci-hcd xhci-hcd.0.auto: xHCI host not responding to stop endpoint command.
[ 13.438464] xhci-hcd xhci-hcd.0.auto: Assuming host is dying, halting host.
[ 13.470724] xhci-hcd xhci-hcd.0.auto: Host not halted after 16000 microseconds.
[ 13.470765] xhci-hcd xhci-hcd.0.auto: Non-responsive xHCI host is not halting.
[ 13.476846] xhci-hcd xhci-hcd.0.auto: Completing active URBs anyway.
[ 13.484259] xhci-hcd xhci-hcd.0.auto: HC died; cleaning up
[ 13.491351] usb usb1-port1: couldn't allocate usb_device
[ 13.496032] usb 2-1: USB disconnect, device number 2

@openwrt-bot
Copy link
Author

reiffert:

I have the same issue on ipq806x R7500. Relevant dmesg:

[ 3.069589] xhci-hcd xhci-hcd.0.auto: xHCI Host Controller [ 3.069659] xhci-hcd xhci-hcd.0.auto: new USB bus registered, assigned bus number 1 [ 3.074210] xhci-hcd xhci-hcd.0.auto: hcc params 0x0228f065 hci version 0x100 quirks 0x00010010 [ 3.081705] xhci-hcd xhci-hcd.0.auto: irq 168, io mem 0x11000000 [ 3.091538] hub 1-0:1.0: USB hub found [ 3.096524] hub 1-0:1.0: 1 port detected [ 3.100606] xhci-hcd xhci-hcd.0.auto: xHCI Host Controller [ 3.104107] xhci-hcd xhci-hcd.0.auto: new USB bus registered, assigned bus number 2 [ 3.109751] usb usb2: We don't know the algorithms for LPM for this host, disabling LPM. [ 3.118021] hub 2-0:1.0: USB hub found [ 3.125427] hub 2-0:1.0: 1 port detected [ 3.129530] xhci-hcd xhci-hcd.1.auto: xHCI Host Controller [ 3.132942] xhci-hcd xhci-hcd.1.auto: new USB bus registered, assigned bus number 3 [ 3.138560] xhci-hcd xhci-hcd.1.auto: hcc params 0x0228f065 hci version 0x100 quirks 0x00010010 [ 3.145828] xhci-hcd xhci-hcd.1.auto: irq 169, io mem 0x10000000 [ 3.155248] hub 3-0:1.0: USB hub found [ 3.160769] hub 3-0:1.0: 1 port detected [ 3.164430] xhci-hcd xhci-hcd.1.auto: xHCI Host Controller [ 3.168402] xhci-hcd xhci-hcd.1.auto: new USB bus registered, assigned bus number 4 [ 3.173697] usb usb4: We don't know the algorithms for LPM for this host, disabling LPM. [ 3.181692] hub 4-0:1.0: USB hub found [ 3.189606] hub 4-0:1.0: 1 port detected [ 3.195128] kmodloader: done loading kernel modules from /etc/modules-boot.d/* [ 3.206452] init: - preinit - [ 3.418439] usb 1-1: new high-speed USB device number 2 using xhci-hcd <---- WHY THAT?? [ 3.689805] usb 2-1: new SuperSpeed USB device number 2 using xhci-hcd

[ 13.438408] xhci-hcd xhci-hcd.0.auto: xHCI host not responding to stop endpoint command.
[ 13.438464] xhci-hcd xhci-hcd.0.auto: Assuming host is dying, halting host.
[ 13.470724] xhci-hcd xhci-hcd.0.auto: Host not halted after 16000 microseconds.
[ 13.470765] xhci-hcd xhci-hcd.0.auto: Non-responsive xHCI host is not halting.
[ 13.476846] xhci-hcd xhci-hcd.0.auto: Completing active URBs anyway.
[ 13.484259] xhci-hcd xhci-hcd.0.auto: HC died; cleaning up
[ 13.491351] usb usb1-port1: couldn't allocate usb_device
[ 13.496032] usb 2-1: USB disconnect, device number 2

@openwrt-bot
Copy link
Author

reiffert:

Works for me when adjusting drivers/usb/core/hub.c

#define HUB_ROOT_RESET_TIME from 50 to 150

According to the USB specs "50ms" is the minimum time reset should be fired.

Is anyone able to reproduce?

@openwrt-bot
Copy link
Author

dissent1:

Not every usb3.0 device connected cause this USB port failure.

Want to propose the patch with HUB_ROOT_RESET_TIME change as a workaround?

There's no other solution for now it seems.

@openwrt-bot
Copy link
Author

mkresin:

Want to propose the patch with HUB_ROOT_RESET_TIME change as a workaround?

As long as the datasheet doesn't say that this particular chip needs more than 50ms to reset, it is a workaround and ugly one on top.

It isn't different to adding sleeps to workaround race conditions. Might work at the moment but will break sooner or later. As long as it isn't a hardware limitation, I prefer to get the underlying issue fixed instead.

@openwrt-bot
Copy link
Author

reiffert:

For the records: More debug and more fixes.

withbug-inserting-after-booting.txt : http://cxg.de/_bfd3bb.htm
withbug-inserting-prior-to-host-poweron.txt : http://cxg.de/_41ae6d.htm

fix1-inserting-after-booting.txt : http://cxg.de/_d5d5b8.htm
fix1-inserting-prior-to-host-poweron.txt : http://cxg.de/_10b096.htm

fix2-inserting-after-booting.txt : http://cxg.de/_7da6e0.htm
fix2-inserting-prior-to-host-poweron.txt : http://cxg.de/_b6cfbd.htm

fix3-inserting-after-booting.txt : http://cxg.de/_209271.htm
fix3-inserting-prior-to-host-poweron.txt : http://cxg.de/_1092b8.htm

@openwrt-bot
Copy link
Author

reiffert:

Side note: chandoo on IRC has a R7500v2 with 2xUSB3.0 ports and doesn't have the regression with a 32GB SanDisk Ultra. See attachment.

@openwrt-bot
Copy link
Author

reiffert:

krm on IRC was able to confirm the bug on Archer C2600, 2x USB3.0 using Kingston Datatraveler 100 G3, 8GB. He was also testing the 3 fixes.

@openwrt-bot
Copy link
Author

reiffert:

There's a 4th "workaround":

On PHY level at the end of the hs and ss init() functions the registers look like:

[ 18.582092] HS init regs: 00 0x10010002 [ 18.583156] HS init regs: 04 0x00000190 [ 18.586713] HS init regs: 08 0x00000004 [ 18.590623] HS init regs: 0c 0x00000000 [ 18.594352] HS init regs: 10 0x01360bf2 [ 18.598171] HS init regs: 14 0x00de06e4 [ 18.602058] HS init regs: 18 0x00000000 <- (1) [ 18.605811] HS init regs: 1c 0x00000100 <- (2) [ 18.609704] HS init regs: 20 0x00000000 [ 18.613449] HS init regs: 24 0x00000000 [ 18.617269] HS init regs: 28 0x00000000 [ 18.621168] HS init regs: 2c 0x00000000 [ 18.672157] SS init regs: 00 0x01000119 [ 18.672175] SS init regs: 04 0x07f8164d [ 18.674784] SS init regs: 08 0x00000000 [ 18.678688] SS init regs: 0c 0x00004b7f [ 18.682424] SS init regs: 10 0x00000000 [ 18.686243] SS init regs: 14 0x00000000 [ 18.690155] SS init regs: 18 0x00000000 [ 18.693881] SS init regs: 1c 0x00000000 [ 18.697701] SS init regs: 20 0x00000000 [ 18.701581] SS init regs: 24 0x00000000 [ 18.705341] SS init regs: 28 0x00000000 [ 18.709234] SS init regs: 2c 0x00000000

(2):

  • The value is 0x0 when no drive is connected. Device speed is reported as superspeed to xhci once plugged in.
  • The value is 0x100 when a thumbdrive is connected during host poweron. Device speed is reported as highspeed to xhci
  • The value is 0x0 again after re-loading the dwc3 module or after host restart. Device speed is reported as superspeed to xhci

Some older kernels refer to this register as CHARGING_DET_OUTPUT_REG

(1): Once there is a 0x100 in CHARGING_DET_OUTPUT_REG you could write 0x30, 0x32, 0x34 or 0x3f into the 0x18 register and the xhci would detect the device as superspeed.
Some older kernels refer to this register as CHARGING_DET_CTRL_REG.
Some older kernels refer to 0x3F as reset value for the CHARGING_DET_CTRL_REG.

It's unclear to me how to disable battery charging from the PHY init level. Once battery charging is detected due to the thumbdrives having no external power source, the device speed is reported as high-speed.
It's also unclear why the CHARGING_DET_OUTPUT_REG is empty after module re-loading or host restart.

Roadmap: identify why the CHARGING_DET_OUTPUT_REG is empty after module re-loading. Supposedly that's the case after XHCI issues a port reset during device detection(?). Then issue the same port reset during PHY initialisation.

@openwrt-bot
Copy link
Author

dissent1:

There seem to be a fix for this in QSDK that is part of this commit:
https://source.codeaurora.org/quic/qsdk/oss/kernel/linux-msm/commit/?h=eggplant&id=d316437c9cdb70023a760342678f32e27241725a

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant