Discussion:
[Nut-upsdev] blazer_usb: compatibility with Ippon BackPro UPSes (Phoenix Power Co., Ltd usb-to-serial controller)
Alexey Loukianov
2010-01-06 15:56:56 UTC
Permalink
Good day to all.

In one of the offices I'm administering there are about 15 UPS marked with the
brand "Ippon BackPro". This brand is widely popular in Russian Federation, while
by itself is a 'Made in PRC' megatec clone. Depending on the model this UPSes
have got either one standard serial RS-232C interface (cheaper models like Ippon
BackPro 400 and Ippon BackPro 600/700 that were manufactured back in about year
2005 or earlier) or dual-port control interface: one serial and one USB.

Ippon UPSes are by default shipped with Winpower java-based monitoring and
controlling solution which is said to be compatible with windows of any kind,
MacOS X and Linux. On workstations I used to use this software and it proved to
work quiet reliable under Windows XP while still require some additional
configuration to power the workstations down correctly before LowBattery
condition arises. It was always a big problem to run this software on server in
that office because of the server hardware that is available there. So-called
"server" is in reality an older workstation with the Celeron 2Ghz CPU and 128Mb
RAM acting as an office router and local mail server running CentOS4. So it was
almost impossible to run Winpower on this box due to Java memory requirements.

The only serial port available on this "server" box is occuped by local-specific
hardware (PBX logs control), so the only choice to connect UPSes to this server
was to use USB.

Back in 2007 I've been trying to configure NUT to use megatec_usb driver, but
that attempts were totally unsuccessful: driver was showing up "no response"
from UPS. Recently I've been able to do "upgrade" of a server adding additional
256Mb of RAM and it allowed me to install and run linux version of Winpower. It
was a big surprise for me that Winpower on linux successfully detects and
controls UPS connected through USB interface but there were some caveats
including huge CPU usage by Winpower main java process and tonns on "Process
java(1234) uses usb device without first claming it" showing up in dmesg. So I
have decided to give NUT one more chance.

Prebuilt binary RPM available for CentOS4 (nut-2.2.0-4.el4) ships with older
megatec_usb driver that still don't work with Ippon UPSes connected through USB.
So I have fetched latest revision of NUT from SVN and tried both megatec
protocol usb drivers from it: megatec_usb and blazer_usb.

megatec_usb driver in debug mode showed up that there's some king of connection
with UPS but correct status readings were obtained once in a 5-10 tries.
blazer_usb showed up no connections with UPS at all. As Winpower on the same
linux box was working perfectly my decision was to try to figure out what are
the differences in the way the USB UPS communication happens between the
Winpower, megatec_usb and blazer_usb.

My best friends in this process were DJ Java Decompiller 3.10 and IDA Pro 4.
Decompilation of Winpower java modules showed up that it uses native system
binary libraries to handle actual accesses to USB bus (System.loadLibrary),
which is called jusbMaxOSX/jusbMaxOSXi386 on MacOS X, jusb.dll on windows and
libusbLinux/libusbLinuxamd on i686/x86-64 linux.

Communication with UPS happens using megatec protocol through the single
function OrderUPS with C-style prototype:

int OrderUPS(char* cmd, char* return_buf, int return_buf_size, int timeout);

This prototype is perfectly self-describing, so I went on to IDA Pro and opened
up libusbLinux.so in it. Quick examination of this library showed up that this
is a PIC-enabled lib and is in fact a mix of libusb-0.1.x with some
Winpower-related funtions including OrderUPS. I'm not sure about the legal
status of this as AFAIR libusb-0.1.x is GNU GPL licensed so Winpower creators
should release full source of modified libusb they have used in their product.
Well, this might be just another GNU GPL violation example, but let me continue
to the interesting part. Disassembly of OrderUPS functions showed up that the
base algorithm of communications with USB UPS is as following:

1. Explore usbfs, find first device with ID 06da:0003 (Phoenixtec Power Co., Ltd).
2. Try to open this device using libusb.
3. In case of failure try to detach kernel driver from this device using libusb
and retry to open it.
4. If the device was opened up correctly set the configuration to 1 on it using:
usb_set_configuration(devfd, 0x1);
5. Send command to device using usb_control_msg function in 8-bytes chunks as
following:

- ------------------- cut --------------------------------------
int index = 0;
int count = strlen(cmd);
while(index < count) {
int tosend = ((count - index) > 8) ? 8 : (count - index);
char tmp[8];
memset(tmp, 0, sizeof(tmp));
memcpy(tmp, &cmd[index], tosend);
int ret = usb_control_message(devfd, 0x21, 0x9, 0x2, 0, tmp, tosend, 1000);
if((ret <= 0) || (ret != tosend)) {
return_buf[0] = 0;
return -1;
}
index += tosend;
}
- ------------------- cut --------------------------------------

6. Read back return string from UPS in one chunk using usb_interrupt_read:

- ------------------- cut --------------------------------------
char tmp[256];
memset(tmp, 0, sizeof(tmp));
int ret = usb_interrupt_read(devfd, 0x81, tmp, max_return_size, timeout);
if(ret <= 0) {
return_buf[0] = 0;
return -1;
}
int max = (max_return_size > 256) ? 256 : max_return_size;
memcpy(return_buf, tmp, max);
return_buf[max] = 0;
usb_close(devfd);
return max;
- ------------------- cut --------------------------------------

The differences between the way Winpower works with UPS and NUT works with UPS
are as follows:

1. megatec_usb and blazer_usb both read up data in 8-byte chunks. Winpower do it
in one large chunk.
2. blazer_usb tries to flush read buffer of controller before sending a command.
3. Both megatec_usb and blazer_usb use different USB message value in
usb_control_message call (Winpower uses 0x2, while blazer_usb uses 0x200 and
megatec_usb uses "ReportId+(0x03<<8)").

So I've changed blazer_usb.c to read up data in one large chunk instead of
8-byte chunks, to pass 0x2 instead of 0x200 in the usb_control_message call and
removed buffer-flushing code completely. At first glance blazer_usb still wasn't
working after this change but experiments showed up that this usb-to-serial
implementation hangs when max_return_size is set larger that 64 bytes. Limiting
return buffer size to 64 bytes in usb_interrupt_read call made blazer_usb to
work perfectly with Ippon UPS.

As for hangs: there are several ways to un-hang hanged UPS. You may replug USB
cable, you may reboot the server and you may use appropriate ioctl on usbfs node
to do a port-reset (libusb usb_reset also will do).

Side note: the device is not working with Winpower after it was hanged by
blazer_usb or megatec_usb.

In the end I've got working blazer_usb on CentOS4 with stock libusb (0.1.8). Key
point was to read up data in one 64 byte chunk. It is still unclear why does the
data got corrupted in megatec_usb when read up in 8-bytes chunks, and why does
the device hang when using original blazer_usb with buffer-flush and 8-byte
chunks read.

To try to make things more clear I had reverted blazer_usb.c back to SVN HEAD
and removed from the code the part that flushes the device buffer before reading
up a command. The result was the device returned back one correct reply on Q1
query but then hanged. Running blazer_usb under ltrace showed that there were
some additional characters left in device buffer after the '\r', so I had added
buffer flush code back but placed it after the buffer read. Nothing changed,
device still locked up after returning one correct answer on "Q1\r" query.

Next proposal was that libusb version shipped with RHEL4/CentOS4 is a bit
outdated and might be somewhat buggy. So I had downloaded libusb-0.1.12 SRPMS
from CentOS5, built binary RPM and installed it. After that the modified
blazer_usb with buffer flush taking place after the reading of reply no longer
made the device hang.

I had reverted blazer_usb back to SVN HEAD once again and it also started to
work normally. However megatec_usb data corruption on receive wasn't cured by
libusb update.

So, the conclusion is as following:

1. blazer_usb driver from SVN HEAD works correctly with USB Ippon UPSes on
CentOS4, but require libusb version update. Version 0.1.12 from CentOS5 will do.

2. megatec_usb driver somehow fails to read up correct data from USB device. The
cause of data corruption is unclear for me as it seems that the steps
megatec_usb driver takes to read up data are almost the same as blazer_usb take.

3. It is possible to make blazer_usb work with RHEL4 version of libusb (0.1.8),
but it require modifications to the way it work. The patch that works on my
setup is attached to this message.

Hope this info will help people to get Ippon UPSes with USB connection to a
server work normally in conjunction with RHEL4/CentOS4.

- --
Best regards,
Alexey Loukianov mailto:***@mail.ru
System Engineer, Mob.:+7(926)218-1320
*nix Specialist
Arjen de Korte
2010-01-06 20:04:00 UTC
Permalink
Post by Alexey Loukianov
1. megatec_usb and blazer_usb both read up data in 8-byte chunks.
Winpower do it in one large chunk.
This depends on what the interrupt endpoint supports. Both megatec_usb
and blazer_usb can be used for widely different UPS makes and models.
Not all support reading more than 8 bytes at a time (in fact, most
don't). See the output of 'lsusb -vvv' for your UPS and lookup the
following:

Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x81 EP 1 IN
bmAttributes 3
Transfer Type Interrupt
Synch Type None
Usage Type Data
wMaxPacketSize 0x0008 1x 8 bytes

The collection of data when reading from the interrupt endpoint is
possibly dealt with through some libusb implementations, but it is
safer to do this in the driver itself.
Post by Alexey Loukianov
2. blazer_usb tries to flush read buffer of controller before
sending a command.
This is needed by some devices supported through this driver.
Post by Alexey Loukianov
In the end I've got working blazer_usb on CentOS4 with stock libusb (0.1.8). Key
point was to read up data in one 64 byte chunk. It is still unclear why does the
data got corrupted in megatec_usb when read up in 8-bytes chunks, and why does
the device hang when using original blazer_usb with buffer-flush and 8-byte
chunks read.
Most likely, the interrupt endpoint stalls. Since the flushing so far
didn't clear stall conditions, it may hang. Improved now, thanks for
the hint.
Post by Alexey Loukianov
1. blazer_usb driver from SVN HEAD works correctly with USB Ippon UPSes on
CentOS4, but require libusb version update. Version 0.1.12 from CentOS5 will do.
Good. Thanks for the report.
Post by Alexey Loukianov
2. megatec_usb driver somehow fails to read up correct data from USB device. The
cause of data corruption is unclear for me as it seems that the steps
megatec_usb driver takes to read up data are almost the same as blazer_usb take.
There is an error in the way how the megatec_usb driver deals with
replies shorter than the 8 bytes it asks for.
Post by Alexey Loukianov
3. It is possible to make blazer_usb work with RHEL4 version of libusb (0.1.8),
but it require modifications to the way it work. The patch that works on my
setup is attached to this message.
Thanks for the patch, but that will badly break other devices with the
same VID:PID combination. Therefor, this is not an option.

Best regards, Arjen
--
Please keep list traffic on the list
Alexey Loukianov
2010-01-07 05:13:23 UTC
Permalink
Post by Arjen de Korte
Post by Alexey Loukianov
1. megatec_usb and blazer_usb both read up data in 8-byte chunks.
Winpower do it in one large chunk.
This depends on what the interrupt endpoint supports. Both megatec_usb
and blazer_usb can be used for widely different UPS makes and models.
Not all support reading more than 8 bytes at a time (in fact, most
don't). See the output of 'lsusb -vvv' for your UPS and lookup the
Well, that is true and false in the same time. The trouble is that usb-to-serial
implementations that are used in UPSes are totally messed up when it comes to
the compliance with USB standards regarding to HID. So it might be either true
or not that the real MAX_READ_WRITE for device equals to reported 8 bytes.
Winpower software on linux does reads in a chunks larger than 8 bytes and it is
proven to work.
Post by Arjen de Korte
The collection of data when reading from the interrupt endpoint is
possibly dealt with through some libusb implementations, but it is
safer to do this in the driver itself.
Inspecting libusb 0.1.12 sources shows that interrupt reads and control channel
writes are done trough URB queries, and MAX_READ_WRITE is in fact set as a
define to 16KB:

#define MAX_READ_WRITE (16 * 1024)

In case of libusb-0.1.12 reading data in 8-bytes chunks works normally with
Ippon UPSes (unsure about other brands - it should be tested before doing any
commits to SVN), but when using libusb-0.1.8 that is the stock version for RHEL4
reading in 8-bytes chunks makes phoenix usb-to-serial device in Ippon UPSes
hang. I've done some more tests today and it looks like that the real output
buffer size for this UPSes is 64 bytes, and the cause for hang is an attempt to
read from buffer when it hadn't been filled yet. I.e., this UPSes fill output
buffer in 64-byte chunks after receiving commands Q1 and F and they expect it to
be read in one chunk.
Post by Arjen de Korte
Post by Alexey Loukianov
2. blazer_usb tries to flush read buffer of controller before
sending a command.
This is needed by some devices supported through this driver.
Sure. With libusb-0.1.12 it is also needed for Ippon UPSes to return correct
replies. Unfortunately this flushing hangs device when using libusb 0.1.8 :-(.
Post by Arjen de Korte
Most likely, the interrupt endpoint stalls. Since the flushing so far
didn't clear stall conditions, it may hang. Improved now, thanks for
the hint.
If you're thinking about usb_clear_halt as a measure to un-hang the device then
this is not an option. I've tried it in a first place without success. The only
way to bring device back to life in a software way was to do usb_reset and then
reenumerate usbfs and reopen device again.
Post by Arjen de Korte
Post by Alexey Loukianov
The cause of data corruption is unclear for me as it seems that the steps
megatec_usb driver takes to read up data are almost the same as blazer_usb take.
There is an error in the way how the megatec_usb driver deals with
replies shorter than the 8 bytes it asks for.
Hmm. That's strange. I've been playing with megatec_usb sources for quite a long
time for now and I'm pretty sure that the data got corrupted somewhere outside
of megatec_usb. Looks like that it happens to be already messed up just after
the usb_interrupt_read call so megatec_usb has nothing to do with it. Am I
missing something? I've been thinking about spending some time with GDB to make
things more clear but I don't like to do gdb debugs off-site (i.e. over SSH,
being far away from the server on the other side of the Moscow) and I'm going to
visit that office on the next week.
Post by Arjen de Korte
Post by Alexey Loukianov
3. It is possible to make blazer_usb work with RHEL4 version of libusb (0.1.8),
but it require modifications to the way it work. The patch that works on my
setup is attached to this message.
Thanks for the patch, but that will badly break other devices with the
same VID:PID combination. Therefor, this is not an option.
Well, for now it is possible to add this kind of behavior as an additional usb
subdriver that is not selected by default for this VID:PID combination. People
who need it may activate its usage manually in configs while other will happily
use original phoenix subdriver. If you want I may redo patch in that way. But in
any case before doing this I'm going to make some additional tests as while it
is proven to be working with Q1 and F commands the rest wasn't tested yet. Some
good hopes gives that fact that issuing command upsupported by Ippon UPS like
"I\r" didn't break anything (device didn't stall) so I hope that the rest will
work as expected also. But I will check it just to be sure.

- --
Best regards,
Alexey Loukianov mailto:***@mail.ru
System Engineer, Mob.:+7(926)218-1320
*nix Specialist
Arjen de Korte
2010-01-07 08:55:26 UTC
Permalink
Post by Alexey Loukianov
Well, that is true and false in the same time. The trouble is that usb-to-serial
implementations that are used in UPSes are totally messed up when it comes to
the compliance with USB standards regarding to HID.
That is *very* true. There are only a few vendors that took the effort
to make their USB to serial converters in the UPS compliant with
usbserial (PowerCOM being one of them).
Post by Alexey Loukianov
So it might be either true
or not that the real MAX_READ_WRITE for device equals to reported 8 bytes.
That's the reason why we stick to 8 bytes. Devices that support
reading 64 bytes at a time will also allow to read in 8 chunks of 8
bytes (but not the other way around).
Post by Alexey Loukianov
Winpower software on linux does reads in a chunks larger than 8
bytes and it is proven to work.
On this particular UPS. But it is also proven *not* to work on devices
with the same VID:PID combination. I tried that before and got burned
(see the archives).
Post by Alexey Loukianov
In case of libusb-0.1.12 reading data in 8-bytes chunks works normally with
Ippon UPSes (unsure about other brands - it should be tested before doing any
commits to SVN),
Not easily. We don't have access to all these devices. Most of the
reports about drivers being broken are from users from the field, from
whom we usually never hear again until something breaks. I'm not
willing to take the risk to break this driver, if there are good
alternatives around.
Post by Alexey Loukianov
but when using libusb-0.1.8 that is the stock version for RHEL4
reading in 8-bytes chunks makes phoenix usb-to-serial device in Ippon UPSes
hang.
In that case, we don't support this device with libusb-0.1.8. I don't
think we should consider patches that might break other devices that
are supported now, to add support for a library that is almost six
years old. The fact that even libusb-0.1.12 is almost four years old
tells me that we should probably up our minimum libusb requirements to
that instead.

Best regards, Arjen
--
Please keep list traffic on the list
Alexey Loukianov
2010-01-07 09:46:52 UTC
Permalink
Post by Arjen de Korte
Not easily. We don't have access to all these devices. Most of the
reports about drivers being broken are from users from the field, from
whom we usually never hear again until something breaks. I'm not
willing to take the risk to break this driver, if there are good
alternatives around.
Again: we do not need to break existing phoenix driver. Current implementation
of blazer_usb allows addition of any new subdriver that can simply co-exist with
existing ones. Why not to add new subdriver and name it - say - ipponrhel4?
It will not break compatibility with the devices that are already working
normally but will present people on CentOS4/RHEL4 with an alternative variant in
case their device is not working correctly with original phoenix subdriver.
Post by Arjen de Korte
Post by Alexey Loukianov
but when using libusb-0.1.8 that is the stock version for RHEL4
reading in 8-bytes chunks makes phoenix usb-to-serial device in Ippon UPSes
hang.
In that case, we don't support this device with libusb-0.1.8. I don't
think we should consider patches that might break other devices that
are supported now, to add support for a library that is almost six
years old. The fact that even libusb-0.1.12 is almost four years old
tells me that we should probably up our minimum libusb requirements to
that instead.
That is correct, but there is one bad consequence when taking such approach. EOL
for RHEL4 (and thus CentOS4, Scientific Linux 4 and all the other RHEL4 clones)
is Feb-2012. It means that RHEL4 and CentOS4 will be actively used for at least
next two years. People who are not subscribers of paid Red Hat or CentOS network
support will easily bump libusb version up to 0.1.12 and will be happy. Rest
will be forced either to lost paid support services from RedHat/CentOS (support
services agreement specifically denies updating of any core system component -
and libusb is one of such components - to a version not supplied/supported by
RedHad), or not to use the device they might be able to use normally in case NUT
will incorporate special patch for libusb-0.1.8-RHEL4 compatibility. Me
personally isn't affected by this at all: I've got no problems in patching NUT
myself and building up special binary RPM package for the servers I administer.
People in question are all that system administrators worldwide that are still
using RHEL4/CentOS and are experiencing troubles with their Ippon (and possibly
others) UPSes when connecting them to a server through USB interface.

- --
Best regards,
Alexey Loukianov mailto:***@mail.ru
System Engineer, Mob.:+7(926)218-1320
*nix Specialist
Arjen de Korte
2010-01-07 12:28:09 UTC
Permalink
Post by Alexey Loukianov
Again: we do not need to break existing phoenix driver. Current implementation
of blazer_usb allows addition of any new subdriver that can simply co-exist with
existing ones. Why not to add new subdriver and name it - say - ipponrhel4?
That is an option (although this was not what your patch originally
did). Can you checkout the latest version from the SVN trunk?

Best regards, Arjen
--
Please keep list traffic on the list
Alexey Loukianov
2010-01-07 13:11:00 UTC
Permalink
Post by Arjen de Korte
That is an option (although this was not what your patch originally
did). Can you checkout the latest version from the SVN trunk?
Will do it this night or tomorrow morning: leaving to party now as we're still
celebrating the new year and Christmas here in Russia. Holidays this year lasts
for about two weeks, till 10th of January - way tooo many for my taste!

- --
Best regards,
Alexey Loukianov mailto:***@mail.ru
System Engineer, Mob.:+7(926)218-1320
*nix Specialist
Alexey Loukianov
2010-01-08 09:35:09 UTC
Permalink
Post by Arjen de Korte
That is an option (although this was not what your patch originally
did). Can you checkout the latest version from the SVN trunk?
Checked out, going to give changes a try.
There were another problem with autoconf I had to fixed when been trying to build nut-trunk on CentOS4:

[***@server nut-trunk]$ autoconf -f
configure.in:84: error: possibly undefined macro: AC_TYPE_LONG_DOUBLE
If this token and others are legitimate, please use m4_pattern_allow.
See the Autoconf documentation.
configure.in:85: error: possibly undefined macro: AC_TYPE_LONG_LONG_INT

Looks like there are references in configure.in to a macros AC_TYPE_* that are not known
to RHEL4 autoconf version. Removing appropriate AC_CHECK_FUNCS call from configure.in
don't break anything for RHEL4 as far as I had tested:

[***@server nut-trunk]$ svn diff configure.in
Index: configure.in
===================================================================
- --- configure.in (revision 2240)
+++ configure.in (working copy)
@@ -78,13 +78,6 @@
AC_CHECK_FUNCS(seteuid setsid getpassphrase)
AC_CHECK_FUNCS(on_exit)

- -dnl the following may add stuff to LIBOBJS (is this still needed?)
- -AC_CHECK_FUNCS(vsnprintf snprintf, [], [
- - AC_LIBOBJ(snprintf)
- - AC_TYPE_LONG_DOUBLE
- - AC_TYPE_LONG_LONG_INT
- -])
- -
AC_REPLACE_FUNCS(setenv inet_aton strerror atexit)

dnl

- -------------------------------------------------------------------

As for modified driver, running on CentOS4 with updated libusb (0.1.12):

[***@server drivers]# ./blazer_usb -u root -DDDD -a ippon-ext-hdd1 -x subdriver=ippon -x runtimecal=60,100,378,50 -x
novendor 2>&1 | tee blazer_usb_ippon.log
Network UPS Tools - Megatec/Q1 protocol USB driver 0.03 (2.4.1-2240M)
0.000000 debug level is '4'
0.001677 Checking device (0000/0000) (003/001)
0.001772 - VendorID: 0000
0.001781 - ProductID: 0000
0.001788 - Manufacturer: Linux 2.6.9-67.0.4.EL ohci_hcd
0.001795 - Product: OHCI Host Controller
0.001802 - Serial Number: 0000:00:03.1
0.001809 - Bus: 003
0.001815 Trying to match device
0.001879 Device does not match - skipping
0.001898 Checking device (06DA/0003) (002/002)
0.028550 - VendorID: 06da
0.028568 - ProductID: 0003
0.028576 - Manufacturer: OMRON
0.028583 - Product: USB UPS
0.028590 - Serial Number: unknown
0.028597 - Bus: 002
0.028604 Trying to match device
0.028684 Device matches
0.036587 Trying megatec protocol...
0.042565 send: Q1
0.298225 read: (21
0.298349 blazer_status: parsing failed
0.298360 Status read 1 failed
0.302562 send: Q1
0.554193 read: (21
0.554249 blazer_status: parsing failed
0.554259 Status read 2 failed
0.558556 send: Q1
0.810160 read: (21
0.810213 blazer_status: parsing failed
0.810223 Status read 3 failed
0.810230 Trying mustek protocol...
0.814550 send: QS
1.210108 read: QS
1.210137 blazer_status: invalid start character [51]
1.210146 Status read 1 failed
1.214541 send: QS
1.610055 read: QS
1.610083 blazer_status: invalid start character [51]
1.610092 Status read 2 failed
1.614533 send: QS
2.010005 read: QS
2.010031 blazer_status: invalid start character [51]
2.010039 Status read 3 failed
2.010048 Trying megatec/old protocol...
2.015522 send: D
2.409953 read: D
2.409982 blazer_status: invalid start character [44]
2.409990 Status read 1 failed
2.414513 send: D
2.809903 read: D
2.809931 blazer_status: invalid start character [44]
2.809940 Status read 2 failed
2.814506 send: D
3.209851 read: D
3.209880 blazer_status: invalid start character [44]
3.209888 Status read 3 failed
3.209896 No supported UPS detected


[***@server drivers]# ./blazer_usb -u root -DDDD -a ippon-ext-hdd1 -x subdriver=phoenix -x runtimecal=60,100,378,50 -x
novendor 2>&1 | tee blazer_usb_phoenix.log
0.000000 debug level is '4'
0.001716 Checking device (0000/0000) (003/001)
0.001799 - VendorID: 0000
0.001808 - ProductID: 0000
0.001815 - Manufacturer: Linux 2.6.9-67.0.4.EL ohci_hcd
0.001822 - Product: OHCI Host Controller
0.001829 - Serial Number: 0000:00:03.1
0.001835 - Bus: 003
0.001842 Trying to match device
0.001905 Device does not match - skipping
0.001924 Checking device (06DA/0003) (002/002)
0.006763 - VendorID: 06da
0.006782 - ProductID: 0003
0.006790 - Manufacturer: unknown
0.006797 - Product: unknown
0.006803 - Serial Number: unknown
0.006811 - Bus: 002
0.006818 Trying to match device
0.006895 Device matches
0.009805 Trying megatec protocol...
0.012076 dump: (8 bytes) => 2e 38 20 30 30 37 20 35
0.028078 dump: (8 bytes) => 30 2e 31 20 31 33 2e 35
1.029744 flush: could not set alt intf 0/0: Connection timed out
1.033743 send: Q1
1.243910 read: (218.8 218.3 218.3 007 50.1 13.5 25.0 00001001
1.244117 Status read in 1 tries
1.244129 Supported UPS detected with megatec protocol
1.259917 dump: (8 bytes) => 2e 33 20 30 30 37 20 35
1.276294 dump: (8 bytes) => 30 2e 31 20 31 33 2e 35
2.277705 flush: could not set alt intf 0/0: Connection timed out
2.281713 send: F
2.395767 read: #220.0 003 12.00 50.0
2.395854 Ratings read in 1 tries
2.395932 battery runtime exponent : 2.655
2.395946 battery runtime nominal : 60.0
2.395970 battery runtime estimate : 49.8
2.395980 No charge time specified, using built in default [43200 seconds]
2.395989 No idle load specified, using built in default [10.0 %]
2.411773 dump: (8 bytes) => 00 00 00 00 00 00 00 00
2.427768 dump: (8 bytes) => 23 32 32 30 2e 30 20 30
2.443765 dump: (8 bytes) => 30 33 20 31 32 2e 30 30
2.459765 dump: (8 bytes) => 20 35 30 2e 30 0d 00 00
2.475763 dump: (8 bytes) => 00 00 00 00 00 00 00 00
3.476682 flush: could not set alt intf 0/0: Connection timed out
3.481688 send: Q1
3.692596 read: (218.3 218.8 218.3 007 50.1 13.5 25.0 00001001
3.692935 dstate_init: sock /var/lib/ups/blazer_usb-ippon-ext-hdd1 open on fd 8
3.708613 dump: (8 bytes) => 2e 33 20 30 30 37 20 35
3.724599 dump: (8 bytes) => 30 2e 31 20 31 33 2e 35
4.725652 flush: could not set alt intf 0/0: Connection timed out
4.729662 send: Q1
4.956434 read: (218.3 218.3 218.3 008 50.1 13.5 25.0 00001001
5.708354 dump: (8 bytes) => 2e 33 20 30 30 38 20 35
5.724344 dump: (8 bytes) => 30 2e 31 20 31 33 2e 35
6.725615 flush: could not set alt intf 0/0: Connection timed out
6.729617 send: Q1
6.956178 read: (218.8 218.8 218.3 007 50.1 13.5 25.0 00001001
7.708098 dump: (8 bytes) => 2e 33 20 30 30 37 20 35
7.724087 dump: (8 bytes) => 30 2e 31 20 31 33 2e 35
8.725568 flush: could not set alt intf 0/0: Connection timed out
8.730575 send: Q1
8.955921 read: (218.3 218.3 218.3 007 50.1 13.5 25.0 00001001
9.707842 dump: (8 bytes) => 2e 33 20 30 30 37 20 35
9.723831 dump: (8 bytes) => 30 2e 31 20 31 33 2e 35
10.725522 flush: could not set alt intf 0/0: Connection timed out
10.730531 send: Q1
10.955665 read: (218.3 218.3 218.3 006 50.1 13.5 25.0 00001001
11.707589 dump: (8 bytes) => 2e 33 20 30 30 36 20 35
11.723575 dump: (8 bytes) => 30 2e 31 20 31 33 2e 35
12.724479 flush: could not set alt intf 0/0: Connection timed out
12.728489 send: Q1
12.940411 read: (218.8 218.3 218.8 007 50.1 13.5 25.0 00001001
13.708324 dump: (8 bytes) => 2e 38 20 30 30 37 20 35
13.724452 dump: (8 bytes) => 30 2e 31 20 31 33 2e 35
.... and so on ....

Haven't got time at the moment to figure out what's wrong with ippon subdriver, will try to do it this evening.

- --
Best regards,
Alexey Loukianov mailto:***@mail.ru
System Engineer, Mob.:+7(926)218-1320
*nix Specialist
Arjen de Korte
2010-01-08 09:59:13 UTC
Permalink
Post by Alexey Loukianov
configure.in:84: error: possibly undefined macro: AC_TYPE_LONG_DOUBLE
If this token and others are legitimate, please use m4_pattern_allow.
See the Autoconf documentation.
configure.in:85: error: possibly undefined macro: AC_TYPE_LONG_LONG_INT
Which version of autoconf is installed on RHEL4? Note that you can
also grab the tarballs from the Buildbots, in which case you don't
need to run autoconf, but can call configure right away.

http://buildbot.ghz.cc/public/nut/

[...]
Post by Alexey Loukianov
Haven't got time at the moment to figure out what's wrong with ippon
subdriver, will try to do it this evening.
We probably need a small delay between sending a command and reading
the reply. Currently it seems to read just three characters. Since we
only try reading once (instead of repeatedly until a timeout or end of
line occurs), this probably fails.

Best regards, Arjen
--
Please keep list traffic on the list
Alexey Loukianov
2010-01-09 10:08:54 UTC
Permalink
Post by Arjen de Korte
We probably need a small delay between sending a command and reading
the reply. Currently it seems to read just three characters. Since we
only try reading once (instead of repeatedly until a timeout or end of
line occurs), this probably fails.
Well, looks like data corruption from megatec_usb somehow got into blazer_usb.

Here are my debug additions to ippon_command:

Index: blazer_usb.c
===================================================================
- --- blazer_usb.c (revision 2240)
+++ blazer_usb.c (working copy)
@@ -170,10 +170,11 @@

static int ippon_command(const char *cmd, char *buf, size_t buflen)
{
- - char tmp[64];
+ char tmp[65], tmp1[65];
int ret;
size_t i;

+ memset(tmp, 0, sizeof(tmp));
snprintf(tmp, sizeof(tmp), "%s", cmd);

for (i = 0; i < strlen(tmp); i += ret) {
@@ -190,8 +191,10 @@

upsdebugx(3, "send: %.*s", (int)strcspn(tmp, "\r"), tmp);

+ usleep(200000);
+
/* Read all 64 bytes of the reply in one large chunk */
- - ret = usb_interrupt_read(udev, 0x81, tmp, sizeof(tmp), 1000);
+ ret = usb_interrupt_read(udev, 0x81, tmp, 64, 2000);

/*
* Any errors here mean that we are unable to read a reply (which
@@ -202,6 +205,13 @@
return ret;
}

+ tmp[ret] = 0;
+ for(i = 0; i < ret; i++) {
+ tmp1[i] = (tmp[i] >= ' ') ? tmp[i] : '^';
+ }
+ tmp1[i] = 0;
+ upsdebugx(3, "read(raw) %d bytes: %s", ret, tmp1);
+
snprintf(buf, sizeof(buf), "%.*s", ret, tmp);

upsdebugx(3, "read: %.*s", (int)strcspn(buf, "\r"), buf);
- -------------------------------------------------------------------------


And here is part of the output:
.........
0.033782 Trying megatec protocol...
0.038754 send: Q1
0.357153 read(raw) 64 bytes: (222.3 2001001^2.3 007 50.1 13.8 25.0 00001001^2.3 007 50.1 13.8
0.357182 read: (22
0.357284 blazer_status: parsing failed
0.357296 Status read 1 failed
..............


P.S. I had revised svn diff from r2219 till current HEAD and I can't find any principal difference in code
that makes my older patch to work correctly while your ippon_command subdriver to fail. Looks
like some kind of a mystery! I'm going to try totally different approach in debugging situation:
to create small and totally separate from the NUT program which will just connect to USB UPS
and query it with Q1 and I commands. In case it will work reliably and correctly - that would be a
good sign that it is time to do NUT code audit for incorrect pointer operations and other things of
such kind. That is the only way I can explain this strange behavior and sudden data corruption.

P.P.S. RHEL4 GNU autotools versions:

[***@server nut-trunk]$ rpm -q automake autoconf
automake-1.9.2-3
autoconf-2.59-5

- --
Best regards,
Alexey Loukianov mailto:***@mail.ru
System Engineer, Mob.:+7(926)218-1320
*nix Specialist
Alexey Loukianov
2010-01-12 09:01:44 UTC
Permalink
Taken one of these devices home today, going to do debugs this evening.

- --
Best regards,
Alexey Loukianov mailto:***@mail.ru
System Engineer, Mob.:+7(926)218-1320
*nix Specialist
Alexey Loukianov
2010-01-12 14:20:00 UTC
Permalink
Post by Arjen de Korte
We probably need a small delay between sending a command and reading
the reply. Currently it seems to read just three characters. Since we
only try reading once (instead of repeatedly until a timeout or end of
line occurs), this probably fails.
Gotcha!
Here is diff that fixes error and makes ippon subdriver to work normally on
RHEL4 (and Fedora8 also, tested on my home PC):

[***@server drivers]$ svn diff blazer_usb.c
Index: blazer_usb.c
===================================================================
- --- blazer_usb.c (revision 2240)
+++ blazer_usb.c (working copy)
@@ -202,7 +202,7 @@
return ret;
}

- - snprintf(buf, sizeof(buf), "%.*s", ret, tmp);
+ snprintf(buf, buflen, "%.*s", ret, tmp);

upsdebugx(3, "read: %.*s", (int)strcspn(buf, "\r"), buf);
return ret;

I think, this bug is pretty self-explaining.

- --
Best regards,
Alexey Loukianov mailto:***@mail.ru
System Engineer, Mob.:+7(926)218-1320
*nix Specialist
Arjen de Korte
2010-01-12 14:31:34 UTC
Permalink
Post by Alexey Loukianov
Here is diff that fixes error and makes ippon subdriver to work normally on
Index: blazer_usb.c
===================================================================
- --- blazer_usb.c (revision 2240)
+++ blazer_usb.c (working copy)
@@ -202,7 +202,7 @@
return ret;
}
- - snprintf(buf, sizeof(buf), "%.*s", ret, tmp);
+ snprintf(buf, buflen, "%.*s", ret, tmp);
upsdebugx(3, "read: %.*s", (int)strcspn(buf, "\r"), buf);
return ret;
I think, this bug is pretty self-explaining.
It sure is (I rest my case). I've committed the change to the trunk. :-)

Best regards, Arjen
--
Please keep list traffic on the list
Alexey Loukianov
2010-01-12 14:46:48 UTC
Permalink
Post by Arjen de Korte
It sure is (I rest my case). I've committed the change to the trunk. :-)
That's goog, all seems to work normally now.
Another thing to consider is an addition of usb_reset() to the USB UPS detection
core (prior to opening USB UPS) - it will 'cure' stalled Ippon UPS in case it
got hanged somehow.

- --
Best regards,
Alexey Loukianov mailto:***@mail.ru
System Engineer, Mob.:+7(926)218-1320
*nix Specialist
Arjen de Korte
2010-01-12 19:17:20 UTC
Permalink
Post by Alexey Loukianov
Another thing to consider is an addition of usb_reset() to the USB UPS detection
core (prior to opening USB UPS) - it will 'cure' stalled Ippon UPS in case it
got hanged somehow.
Replugging will also, so I don't think this is something vital (we
never got any reports to that matter either). What error will it
return it when it is stalled? Post the output of the driver starting
up a stalled device with debug level set to -DDD.

Best regards, Arjen
--
Please keep list traffic on the list
Alexey Loukianov
2010-01-12 19:38:19 UTC
Permalink
Post by Arjen de Korte
Replugging will also, so I don't think this is something vital (we
never got any reports to that matter either). What error will it
return it when it is stalled? Post the output of the driver starting
up a stalled device with debug level set to -DDD.
Errors may be different. I've seen EPIPE while been debugging, and now I've
forced the device to stall and get the following blazer_usb run log:

[***@lx2linux bin]$ sudo ./blazer_usb -u root -DDD -a ippon-ext-hdd1 -x
subdriver=ippon -x runtimecal=60,100,378,50 -x novendor
Network UPS Tools - Megatec/Q1 protocol USB driver 0.03 (2.4.1-2240)
0.000000 debug level is '3'
0.000614 Checking device (1005/B113) (001/006)
0.002221 - VendorID: 1005
0.002231 - ProductID: b113
0.002235 - Manufacturer: USB
0.002240 - Product: FLASH DRIVE
0.002363 - Serial Number: B509112400001042
0.002372 - Bus: 001
0.002378 Trying to match device
0.002533 Device does not match - skipping
0.002610 Checking device (05E3/0608) (001/003)
0.004475 - VendorID: 05e3
0.004492 - ProductID: 0608
0.004496 - Manufacturer: unknown
0.004502 - Product: USB2.0 Hub
0.004507 - Serial Number: unknown
0.004511 - Bus: 001
0.004734 Trying to match device
0.004818 Device does not match - skipping
0.004832 Checking device (0AC8/3450) (001/002)
0.007496 - VendorID: 0ac8
0.007514 - ProductID: 3450
0.007519 - Manufacturer: A4 TECH Corp.
0.007525 - Product: A4 TECH USB2.0 PC Camera E
0.007529 - Serial Number: unknown
0.007727 - Bus: 001
0.007733 Trying to match device
0.007822 Device does not match - skipping
0.007941 Checking device (1D6B/0002) (001/001)
0.008027 - VendorID: 1d6b
0.008035 - ProductID: 0002
0.008040 - Manufacturer: Linux 2.6.26.8-57.fc8PAE ehci_hcd
0.008120 - Product: EHCI Host Controller
0.008202 - Serial Number: 0000:00:02.1
0.008210 - Bus: 001
0.008214 Trying to match device
0.008293 Device does not match - skipping
0.008374 Checking device (06DA/0003) (002/004)
0.012500 - VendorID: 06da
0.012512 - ProductID: 0003
0.012518 - Manufacturer: unknown
0.012523 - Product: unknown
0.012528 - Serial Number: unknown
0.012861 - Bus: 002
0.012868 Trying to match device
0.013039 Device matches
0.015499 Trying megatec protocol...
0.018490 send: error sending control message: Timer expired
0.018528 blazer_status: short reply
0.018550 Status read 1 failed
0.021484 send: error sending control message: Timer expired
0.021521 blazer_status: short reply
0.021547 Status read 2 failed
0.024493 send: error sending control message: Timer expired
0.024532 blazer_status: short reply
0.024553 Status read 3 failed
0.024577 Trying mustek protocol...
0.027585 send: error sending control message: Timer expired
0.027625 blazer_status: short reply
0.027649 Status read 1 failed
0.030498 send: error sending control message: Timer expired
0.030534 blazer_status: short reply
0.030559 Status read 2 failed
0.033492 send: error sending control message: Timer expired
0.033528 blazer_status: short reply
0.033553 Status read 3 failed
0.033574 Trying megatec/old protocol...
0.036485 send: error sending control message: Timer expired
0.036653 blazer_status: short reply
0.036673 Status read 1 failed
0.039718 send: error sending control message: Timer expired
0.039802 blazer_status: short reply
0.039839 Status read 2 failed
0.042493 send: error sending control message: Timer expired
0.042511 blazer_status: short reply
0.042517 Status read 3 failed
0.042521 No supported UPS detected

Corresponding entries from dmesg:
usb 2-3: usbfs: USBDEVFS_CONTROL failed cmd blazer_usb rqt 128 rq 6 len 255 ret -62
usb 2-3: usbfs: USBDEVFS_CONTROL failed cmd blazer_usb rqt 128 rq 6 len 255 ret -62
usb 2-3: usbfs: USBDEVFS_CONTROL failed cmd blazer_usb rqt 33 rq 9 len 8 ret -62
usb 2-3: usbfs: USBDEVFS_CONTROL failed cmd blazer_usb rqt 33 rq 9 len 8 ret -62
usb 2-3: usbfs: USBDEVFS_CONTROL failed cmd blazer_usb rqt 33 rq 9 len 8 ret -62
usb 2-3: usbfs: USBDEVFS_CONTROL failed cmd blazer_usb rqt 33 rq 9 len 8 ret -62
usb 2-3: usbfs: USBDEVFS_CONTROL failed cmd blazer_usb rqt 33 rq 9 len 8 ret -62
usb 2-3: usbfs: USBDEVFS_CONTROL failed cmd blazer_usb rqt 33 rq 9 len 8 ret -62
usb 2-3: usbfs: USBDEVFS_CONTROL failed cmd blazer_usb rqt 33 rq 9 len 8 ret -62
usb 2-3: usbfs: USBDEVFS_CONTROL failed cmd blazer_usb rqt 33 rq 9 len 8 ret -62
usb 2-3: usbfs: USBDEVFS_CONTROL failed cmd blazer_usb rqt 33 rq 9 len 8 ret -62

So, in case there's an EPIPE condition which can't be cleared using
usb_clear_halt(), or there's "Timer expired" condition (errno -62 - don't
remember what is the correct E-alias for it) - it might be worth try to do
usb_reset on device as a last resort (and then to re-enumerate the bus and
reopen the device). If it don't help then it is either something totally wrong
with the device or it's time to stand up and went to a place where the box
resides to replug the cable.

- --
Best regards,
Alexey Loukianov mailto:***@mail.ru
System Engineer, Mob.:+7(926)218-1320
*nix Specialist
Arjen de Korte
2010-01-13 13:16:12 UTC
Permalink
Post by Alexey Loukianov
So, in case there's an EPIPE condition which can't be cleared using
usb_clear_halt(), or there's "Timer expired" condition (errno -62 - don't
remember what is the correct E-alias for it)
That's usually ETIME.
Post by Alexey Loukianov
- it might be worth try to do
usb_reset on device as a last resort (and then to re-enumerate the bus and
reopen the device).
This is a bit tricky. I'm not sure if we can still (or need to) call
usb_close() on the device handle after that. This might be very much
depending on the OS in use. On Linux you should not call usb_close()
in some cases, since that might cause a double free corruption and
should in other cases. So if we can avoid it, that would probably be a
safer option (although I added something to the trunk that tries
anyway).

Best regards, Arjen
--
Please keep list traffic on the list
Alexey Loukianov
2010-01-13 13:58:29 UTC
Permalink
Post by Arjen de Korte
This is a bit tricky. I'm not sure if we can still (or need to) call
usb_close() on the device handle after that. This might be very much
depending on the OS in use. On Linux you should not call usb_close()
in some cases, since that might cause a double free corruption and
should in other cases. So if we can avoid it, that would probably be a
safer option (although I added something to the trunk that tries
anyway).
This is surely system-dependent, but it seems that while device handle becomes
invalid just after the call to usb_reset() (so bus should be re-enumarated and a
new call to usb_open() is required) you still need to call usb_close() to free
heap mem allocated by libusb and to close the file descriptor.

Just tested it on Fedora 8 gives no double free corruption.

Another way to reset a device (I used to use it instead of usb_reset() as it is
more generic) is as follows:

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>
#include <sys/ioctl.h>

#include <linux/usbdevice_fs.h>


int main(int argc, char **argv)
{
const char *filename;
int fd;
int rc;

if (argc != 2) {
fprintf(stderr, "Usage: usbreset device-filename\n");
return 1;
}
filename = argv[1];

fd = open(filename, O_WRONLY);
if (fd < 0) {
perror("Error opening output file");
return 1;
}

printf("Resetting USB device %s\n", filename);
rc = ioctl(fd, USBDEVFS_RESET, 0);
if (rc < 0) {
perror("Error in ioctl");
return 1;
}
printf("Reset successful\n");

close(fd);
return 0;
}

It is totally linux-dependent, so if you will take such approach some IFDEFs
will be required (but they are some already there to avoid double free
corruption anyway).

- --
Best regards,
Alexey Loukianov mailto:***@mail.ru
System Engineer, Mob.:+7(926)218-1320
*nix Specialist

Loading...