[SOLVED] Serial problem
-
I seem to have found the root cause of this, now working for 4 days with no corrupted data.
Great news! :thumbs_up:
But I have seen this overnight in the log, anything to worry about?
How often are you seeing this log? If you see this error very frequently, kindly report a bug/issue. I have no clue why it failed to retrieve a node from the database. When this happens, that particular message is dropped (without any further action).
It is a holiday weekend here, but I will post the story of how I found and fixed the issue. You can all guess if you like. I bet nobody will get it!
Enjoy your holiday weekend! I am eager to see your fix
-
@skywatch Can you break the surprise?
-
@jkandasa OK - it's a long story though......
So for the last 2 months after upgrading to mysensors 2.3.0 I get the library update pop-up with arduino IDE - It sometimes tells me to upgrade to a lower version of the library than I already have, so I figured it was a bit buggy.
Recently it was telling me I needed to upgrade to the latest mysensors 2.2.0 - But since I already had 2.3.0 installed I ignored it and carried on. After all, all the nodes reported as 2.3.0 and there was only 1 mysensors folder inside the libraries folder.
But when I went into the mysensors examples to clear eeprom, I found 2x mysensors options. They were different. I was baffled. I rechecked the libraries folder, only one mysensors folder. I checked app data, program files, temp files and everywhere I could think of. No other mysensors folders present.
After a lot of searching I found something interesting - In the arduino libraries folder there were a few 'arduino xxxxxx' folders (where xxxxxx was 6 seemingly random digits). Inside one of them were files relating to mysensors 2.1.1. - I deleted this folder and tested the IDE which compiled mysensors sketches with no issues.
With this rogue folder removed I decided to re-flash the gateway and all nodes. This was the point that all the problems went away (and now even ack works properly)!
So it appears that this was the issue - it explains why the issue was only affecting me, but I still can't figure out what was happening fully as the nodes all reported as 2.3.0 and yet somehow having an older mysensors.h and mysensorscore.h inside a folder that wasn't mysensors in any way seemed to be causing the issue.
Still, I found the problem and now have a stable system. It just goes to show that if you don't give up, eventually you'll make progress.
So a big THANK YOU to you all for the help with this and sorry for taking your time on this one.
It's a learning curve for sure!
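For anyone chasing a similar ghost library, the manual search described above can be sketched as a small script. This is only an illustration, not something from the thread: the function name `find_duplicate_headers` is made up here, and you have to point it at your own libraries folder, since the location differs between installs.

```python
from pathlib import Path


def find_duplicate_headers(libraries_dir, header_name="MySensors.h"):
    """Return every folder under libraries_dir that contains header_name.

    The name comparison is case-insensitive, since the stray copy may be
    spelled differently from the real one (e.g. mysensors.h).
    """
    wanted = header_name.lower()
    hits = []
    for path in Path(libraries_dir).rglob("*"):
        if path.is_file() and path.name.lower() == wanted:
            hits.append(path.parent)
    return sorted(hits)


def report(libraries_dir, header_name="MySensors.h"):
    """Print each folder holding a copy; more than one is suspicious."""
    copies = find_duplicate_headers(libraries_dir, header_name)
    for folder in copies:
        print(folder)
    if len(copies) > 1:
        print(f"Warning: {len(copies)} copies of {header_name} found")
    return copies
```

Calling `report()` with the path of the Arduino libraries folder lists every folder that holds a copy of the header. More than one hit suggests the IDE may be picking up a different version than the one you think is installed.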
-
@skywatch Interesting story
So a big THANK YOU to you all for the help with this and sorry for taking your time on this one.
No worries at all. I have fixed a couple of other issues (while working on this) and we have improved MyController quality further.
-
@jkandasa Thank you! - I have no idea what created those arduinoxxxxxx folders, there were 4 or 5 of them. A casual look and it's easy to think they are part of the internal arduino ide or something.
Still, at least I may make more progress now.
But not a time for relaxing, I still have FOTA, SW signing, HW signing and low power battery nodes to learn! - You have been warned! Hahahaha.....:)
-
-
@jkandasa - I was about to mark this as solved, but this morning while I was asleep it did this to me....... MyC 1.4.0 snapshot installed......
2018-09-03 04:37:57,662 ERROR [Thread-31] [org.mycontroller.standalone.gateway.serial.SerialDataListenerjSerialComm:178] Exception, RawMessage:[รฏยพย51รฏยฟยฟ24;1;1;0;0;24.31]
org.mycontroller.standalone.exceptions.MessageParserException: Invalid range for 'nodeId':MessageParserAbstract(gatewayId=1, nodeId=5124, childSensorId=1, messageType=1, ack=0, subType=0, payload=24.31, isTxMessage=false, timestamp=1535945877644)
    at org.mycontroller.standalone.provider.mysensors.MessageParserAbstract.validate(MessageParserAbstract.java:157)
    at org.mycontroller.standalone.provider.mysensors.MessageParserAbstract.update(MessageParserAbstract.java:123)
    at org.mycontroller.standalone.provider.mysensors.MessageParserSerial.getMessage(MessageParserSerial.java:32)
    at org.mycontroller.standalone.provider.mysensors.MessageParserSerial.getMessage(MessageParserSerial.java:28)
    at org.mycontroller.standalone.gateway.serial.SerialDataListenerjSerialComm.serialEvent(SerialDriverJSerialComm.java:147)
    at com.fazecast.jSerialComm.SerialPort$SerialPortEventListener.waitForSerialEvent(SerialPort.java:937)
    at com.fazecast.jSerialComm.SerialPort$SerialPortEventListener$1.run(SerialPort.java:885)
    at java.lang.Thread.run(Thread.java:748)
Why didn't I take up an easy hobby like fishing?
-
@skywatch looks like some corruption in received data from your serial port.
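To show why a corrupted line like the one in that log gets rejected, here is a minimal sketch of range-checking a MySensors serial message (format `node-id;child-sensor-id;command;ack;type;payload`). It is not MyController's actual parser; the function name and the exact checks are assumptions made for illustration.

```python
def parse_mysensors_line(line):
    """Parse 'node;child;command;ack;type;payload' and validate ranges.

    Raises ValueError if the line is malformed or a field is out of range.
    """
    parts = line.strip().split(";", 5)
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields, got {len(parts)}: {line!r}")

    # Upper limits: node and child IDs fit in one byte, command is 0..4,
    # ack is 0 or 1, and the type is a one-byte value.
    names = ("node_id", "child_id", "command", "ack", "sub_type")
    limits = (255, 255, 4, 1, 255)

    msg = {}
    for name, text, limit in zip(names, parts[:5], limits):
        # Reject stray bytes instead of silently stripping them.
        if not (text.isascii() and text.isdigit()):
            raise ValueError(f"{name} is not a plain number: {text!r}")
        value = int(text)
        if value > limit:
            raise ValueError(f"invalid range for {name}: {value}")
        msg[name] = value

    msg["payload"] = parts[5]
    return msg
```

A clean line such as `24;1;1;0;0;24.31` parses normally. A line whose first field has picked up extra characters or digits fails the node ID check, which is the same kind of error the log above reports for nodeId=5124.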
-
@jkandasa Today I noticed no data for 2 hours from any node.
I disconnected the pro mini from the pi, waited 20 seconds and then reconnected. The GW reconnected fine with MyC and all node data returned again.
So I guess we can be sure it is either the serial port on the pi locking up or the pro mini serial having an issue.
At least it is proven where the issue is, now to look into it some more. I will get around to trying the 5V pro mini as a GW (with level shifter for serial) just to see if that is the problem. I will also ask the question on the MyS site as well to see if anyone has similar problems.....
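A long silence like the 2 hours above is easier to catch with a simple timer on the controller side. The following is a hypothetical sketch (the class name and the timeout are made up here, and it is not part of MyController): it only tracks when the last message arrived and leaves the reaction, such as reopening the port or sending an alert, to the caller.

```python
import time


class SilenceWatchdog:
    """Tracks how long it has been since the gateway last delivered a message."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock          # injectable so it can be tested without waiting
        self._last_seen = clock()

    def message_received(self):
        """Call this for every line read from the serial port."""
        self._last_seen = self._clock()

    def is_silent(self):
        """True once no message has arrived for longer than timeout_s."""
        return self._clock() - self._last_seen > self._timeout_s
```

With a timeout of, say, 600 seconds, `is_silent()` would have flagged the lock-up after 10 minutes rather than whenever someone happened to look at the dashboard.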
-
-
@jkandasa Thanks for the reply.
I do wonder though that if the problem is with the GW serial port then it will be just as likely to fail connected to a serial to usb connection as well as when connected to a pi. Unless the pi is somehow upsetting the serial port on the pro mini in a way that the usb to ttl will not.
I did ask on MySensors forum about using 3.3v pro mini as GW and the problems I experience, but no replies yet.
I'll try and modify things for a 5V pro mini on serial in the next few days and if that doesn't solve the issues I will try the usb converter you suggested (I have one somewhere!)....
Thanks.
-
@skywatch I couldn't find a usb converter so I ordered a couple just in case.
FWIW I also added a temperature controlled 20mm fan to aid the cooling of the pi. Even with the big heatsink posted earlier in this thread, you can see the difference when the fan is activated.
I only used 4 components to make a fan controller that works with a python script to control the speed according to cpu temperature - so virtually noiseless in most circumstances.
Oh, I may have some news about this issue tomorrow as well....... (Cue suspense music from 1950's horror/scifi movie).....
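The actual fan script was not posted, so the following is only a guess at the general idea: read the CPU temperature and map it to a PWM duty cycle. The thresholds (45 C, 65 C, 30% minimum duty) are assumptions chosen for illustration, not values from this thread.

```python
def read_cpu_temp_c(path="/sys/class/thermal/thermal_zone0/temp"):
    """Read the CPU temperature in degrees C (the file holds millidegrees)."""
    with open(path) as f:
        return int(f.read().strip()) / 1000.0


def fan_duty_cycle(temp_c, start_c=45.0, full_c=65.0, min_duty=30.0):
    """Map a temperature to a fan PWM duty cycle in percent.

    Below start_c the fan is off, at full_c and above it runs flat out,
    and in between the duty cycle rises linearly from min_duty to 100.
    """
    if temp_c < start_c:
        return 0.0
    if temp_c >= full_c:
        return 100.0
    span = (temp_c - start_c) / (full_c - start_c)
    return min_duty + span * (100.0 - min_duty)
```

On the Pi the result could be fed to a PWM output in a loop, for example with `ChangeDutyCycle()` from RPi.GPIO. The minimum duty is there because small fans often do not start turning at very low duty cycles.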
-
You will be happy to know that this issue is now SOLVED!
Since installing the latest SNAPSHOT 1.4.0, all the problems associated with this have gone away.
Phew, what a little sod this problem has been, but we never gave up and all is well now.
Big thanks!
-
@skywatch Great! nice to hear this.
You may switch the gateway raw message logging from DEBUG back to INFO (after some days), as this log will keep growing.
-
@jkandasa After nearly 4 full days without issue I have returned the settings as suggested.
Whilst I am very happy that it is back working again, I guess now we will never know what exactly caused the issue or if it is likely to happen again (to me or others).
For V2.0 you might like to think about a page in the settings area with radio buttons to set the 'INFO' / 'DEBUG' etc parameters. Would be easier, especially for people who are new to linux.
-
NOOOOoooooooooooo..........
Experiencing that the controller was not seemingly getting data reliably from some nodes, I tried to up the GW power from LOW to HIGH. I reflashed the GW last night and set it all back up.
This morning I find the spikes and ghost nodes are all back, along with long periods where node data is not presented on the dashboard.
I have now cleared the GW eeprom and reflashed again with LOW power, just to see what happens. I have rechecked all connections and will see how it goes.
If the issues continue I will re-enable all those DEBUG's and let you know.
-
Now I have USB-serial converter plugged into the pi.
I tested the GW by flashing debug on pro mini and connecting to laptop - all working well and data from sensors seen.
I reflash the GW without debug and connect to the usb-serial and power on pi.
When I access myc I can see the serial GW is up and running. I can see that it has nodes and sensors (maybe inherited from last serial gw?) - But, no data from anything but myagent.
All RF nodes are down. No amount of discover or refresh will show anything. I even deleted a few, but no sign yet that they will re-register.
I even tried swapping Tx/Rx on the usb to serial just in case they printed it wrong.
It made no difference. Is there something I have forgotten to (or didn't know I had to) do?
-
@skywatch Is your gateway up and running in MyController? Can you show your gateway settings in MyController and Gateway Sketch?
-
@jkandasa Sure thing!
GW code is.....
// Enable and select radio type attached
#define MY_RADIO_NRF24
#define MY_RF24_CHANNEL (97)
#define MY_RF24_PA_LEVEL RF24_PA_LOW
#define MY_GATEWAY_SERIAL
#define MY_DEFAULT_LED_BLINK_PERIOD 300

#include <MySensors.h>

void setup() {
  // Setup locally attached sensors
}

void presentation() {
  // Present locally attached sensors
}

void loop() {
  // Send locally attached sensor data here
}
And GW settings in myc are.....
And...
Results in
-
@skywatch All looks good. What about the Node 0 (gateway node) status?
Also, run ls -lh /dev/ttyUSB* on your RPi.
Irrelevant to this issue: you may disable Stream ack enabled; this will speed up your OTA update.
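The same check as the ls command can be scripted, which also shows whether the current user may actually open the port. This is a small sketch; the function name is made up here and the `/dev/ttyUSB*` pattern is taken from the command above.

```python
import glob
import os


def list_serial_devices(pattern="/dev/ttyUSB*"):
    """Return (path, accessible) pairs for every device matching pattern.

    accessible is True when the current user can both read and write the
    device, which is needed to talk to a serial gateway.
    """
    return [
        (path, os.access(path, os.R_OK | os.W_OK))
        for path in sorted(glob.glob(pattern))
    ]


def describe(pattern="/dev/ttyUSB*"):
    """Print one line per device, or a hint if none were found."""
    devices = list_serial_devices(pattern)
    if not devices:
        print(f"No devices match {pattern}")
    for path, accessible in devices:
        print(path, "ok" if accessible else "no read/write permission")
    return devices
```

An empty result means the USB-serial converter was not detected at all. A device that shows up without read/write permission usually points to the user running the controller lacking access to the port.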