Troubleshooting

Hardware

HW1 Issue - The LORIX One doesn't start

If the LORIX One doesn't start, please check the following points:

  • Verify that the gateway is connected to the passive PoE injector as described in the user manual
  • Do not use an active PoE injector, or any passive PoE injector other than the one provided with your LORIX One
  • The LORIX One cannot be powered through the USB port


Software

SW1 Issue - Error when trying to stop the running packet-forwarder using the clouds-manager script

When trying to stop a running packet-forwarder (no matter which one), an error reports that no process has been killed, and the packet-forwarder is still running.

Symptoms

The clouds-manager status command shows a running packet-forwarder:

$ /etc/init.d/clouds-manager.sh status
Cloud ttn status:
/opt/lorix/clouds/ttn/poly_pkt_fwd (pid 1008) is running...

However, stopping this running process with the clouds-manager fails:

$ /etc/init.d/clouds-manager.sh stop
Stopping cloud ttn... no /opt/lorix/clouds/ttn/poly_pkt_fwd found; none killed
timeout during wait for stop
no /opt/lorix/clouds/ttn/poly_pkt_fwd found; none killed
/opt/lorix/clouds/ttn/poly_pkt_fwd was not running
done.

This error appears when the process is running but the pid file that records its pid is missing from the /var/run directory. It normally occurs when the clouds-manager is configured for autoboot.

Fix

A short-term option is to stop the process manually, using the pid reported by the clouds-manager.sh status command:

$ sudo kill -9 1008

However, the problem will most likely appear again on the next reboot.
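
The manual stop can also be scripted. The following is a minimal sketch, not part of the clouds-manager: the helper names extract_pid and stop_pid are hypothetical. It reads the pid from the status output and tries SIGTERM first, falling back to SIGKILL only if the process does not exit within the timeout.

```shell
#!/bin/sh
# Sketch only: extract_pid and stop_pid are hypothetical helpers,
# not part of the clouds-manager package.

# Read `clouds-manager.sh status` output on stdin and print the first pid found.
extract_pid() {
    sed -n 's/.*(pid \([0-9][0-9]*\)) is running.*/\1/p' | head -n 1
}

# Send SIGTERM to pid $1, wait up to $2 seconds (default 5) for it to exit,
# and send SIGKILL only if it is still alive after that.
stop_pid() {
    pid=$1
    timeout=${2:-5}
    kill -TERM "$pid" 2>/dev/null || return 0   # nothing to signal
    while [ "$timeout" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0  # process has exited
        sleep 1
        timeout=$((timeout - 1))
    done
    kill -KILL "$pid" 2>/dev/null
    return 0
}

# Example, run as root:
#   pid=$(/etc/init.d/clouds-manager.sh status | extract_pid)
#   [ -n "$pid" ] && stop_pid "$pid" 5
```

With the status output shown in the symptoms above, extract_pid prints 1008.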


The release of 22 February 2018 fixes this issue (changelog).


SW2 Issue - TTN and Semtech packet forwarders don't start if there is no internet connection at boot time

When autoboot is configured in the clouds-manager and the server cannot be reached at boot time, the poly_pkt_fwd used for TTN (/opt/lorix/clouds/ttn) and the lora_pkt_fwd from Semtech (/opt/lorix/clouds/packet-forwarder) either die or stay stuck, and never contact the server afterwards. This problem is most likely to appear when the internet connection is provided by a 3G/4G modem, which can take longer to come up at boot.

Fix

The release of 22 February 2018 fixes this issue (changelog) for the poly_pkt_fwd, which now waits at boot until the server is available.

The release of 25 March 2018 fixes this issue (changelog) for the lora_pkt_fwd, which now waits at boot until the server is available.
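
The idea behind "waiting until the server is available" can be illustrated with a retry loop. This is only a sketch of the general technique, not the code used inside the packet forwarders: wait_for is a hypothetical helper and server.example.org is a placeholder host name.

```shell
#!/bin/sh
# Sketch only: wait_for is a hypothetical helper and does not reproduce
# the actual change made in poly_pkt_fwd or lora_pkt_fwd.

# Run a command repeatedly until it succeeds.
# $1 = maximum number of attempts, $2 = seconds between attempts,
# remaining arguments = the command to run.
wait_for() {
    attempts=$1
    delay=$2
    shift 2
    while [ "$attempts" -gt 0 ]; do
        "$@" >/dev/null 2>&1 && return 0
        attempts=$((attempts - 1))
        if [ "$attempts" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example (placeholder host name): try for up to 5 minutes.
#   wait_for 60 5 ping -c 1 server.example.org && echo "server reachable"
```

A loop like this tolerates a slow 3G/4G modem because the check is simply repeated until the link is up or the attempts run out.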


The packages should then be updated as described on the opkg package system page, using the following commands:

$ sudo opkg update
$ sudo opkg upgrade

Following the update, the old configuration files (global_conf.json and local_conf.json of the updated packages) are renamed to *.json.bkp and new default configuration files replace them. It is then necessary to copy the old parameters into the new configuration files.
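
To see which parameters need to be carried over, the backups can be compared with the new defaults. The following is a minimal sketch; show_conf_changes is a hypothetical helper, and it assumes the diff utility is available on the gateway.

```shell
#!/bin/sh
# Sketch only: show_conf_changes is a hypothetical helper.

# For every *.json.bkp file in directory $1, print a unified diff between
# the backup (old settings) and the new default file that replaced it.
show_conf_changes() {
    dir=$1
    for bkp in "$dir"/*.json.bkp; do
        [ -e "$bkp" ] || continue        # no backup files in this directory
        new=${bkp%.bkp}
        if [ -e "$new" ]; then
            # diff exits with status 1 when the files differ; that is expected
            diff -u "$bkp" "$new" || true
        else
            echo "no replacement found for $bkp"
        fi
    done
    return 0
}

# Example:
#   show_conf_changes /opt/lorix/clouds/ttn
```

Lines starting with "-" come from the old configuration and lines starting with "+" from the new default file.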

SW3 Issue - Clouds-manager opkg error during Release 22 February 2018 update

As described in the changelog, the release of 22 February 2018 brings new features and also fixes an issue in the clouds-manager.sh script. This bug occurs while the package is being uninstalled. Since the package must be uninstalled before it can be updated, installing the bug fix triggers the bug one last time; later updates are not affected.

Symptoms

When updating the LORIX One using the commands

$ sudo opkg update
$ sudo opkg upgrade

which respectively update the list of available packages and upgrade the currently installed packages, the following error will most likely appear:

Upgrading clouds-manager from 1.0.0-r1 to 1.0.0-r3 on root.
[...]
To remove package debris, try `opkg remove clouds-manager`.
To re-attempt the install, try `opkg install clouds-manager`.
[...]
Collected errors:
 * pkg_run_script: package "clouds-manager" prerm script returned status 2.
 * prerm_upgrade_old_pkg: prerm script for package "clouds-manager" failed

Fix

This problem can be solved by forcing the removal of the clouds-manager and then installing it again, using these commands when the error appears:

$ sudo opkg remove clouds-manager --force-remove
$ sudo opkg install clouds-manager

The result is the following:

Installing clouds-manager (1.0.0-r3) on root.
Cloud loriot already stopped, abort.
update-rc.d: /etc/init.d/init-clouds-manager exists during rc.d purge (continuing)
Removing any system startup links for init-clouds-manager ...
Configuring clouds-manager.
clouds-manager: restarting the stopped service
Starting cloud loriot... done.
Adding system startup for /etc/init.d/init-clouds-manager.
Cloud loriot already running, abort.
Collected errors:
* pkg_get_installed_files: Failed to open ///var/lib/opkg/info/clouds-manager.list: No such file or directory.
* pkg_get_installed_files: Failed to open ///var/lib/opkg/info/clouds-manager.list: No such file or directory.

The errors reported in the last two lines can be ignored.
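
The recovery can be applied only when the specific error is present in the upgrade output. The following is a minimal sketch; prerm_failed is a hypothetical helper and /tmp/opkg-upgrade.log is an arbitrary log path chosen for the example.

```shell
#!/bin/sh
# Sketch only: prerm_failed is a hypothetical helper.

# Read opkg output on stdin; succeed if it contains the clouds-manager
# prerm failure shown in the symptoms.
prerm_failed() {
    grep -q 'prerm script for package "clouds-manager" failed'
}

# Example, run as root (the log path is arbitrary):
#   opkg update
#   opkg upgrade > /tmp/opkg-upgrade.log 2>&1
#   if prerm_failed < /tmp/opkg-upgrade.log; then
#       opkg remove clouds-manager --force-remove
#       opkg install clouds-manager
#   fi
```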


SW4 Issue - opkg returns wget error 5 during update

When updating the list of available opkg packages, wget returns status code 5 and the update fails.

Symptoms

When updating the list of available opkg packages:

$ sudo opkg update

The result is the following:

Downloading http://lorixone.io/yocto/feeds/2.1.2/all/Packages.gz.
Downloading http://lorixone.io/yocto/feeds/2.1.2/cortexa5hf-neon/Packages.gz.
Downloading http://lorixone.io/yocto/feeds/2.1.2/sama5d4_lorix_one/Packages.gz.
Downloading http://lorixone.io/yocto/feeds/2.1.2/sama5d4_lorix_one/Packages.gz.
Collected errors:
* opkg_download_backend: Failed to download http://lorixone.io/yocto/feeds/2.1.2/all/Packages.gz, wget returned 5.
* opkg_download_backend: Failed to download http://lorixone.io/yocto/feeds/2.1.2/cortexa5hf-neon/Packages.gz, wget returned 5.
* opkg_download_backend: Failed to download http://lorixone.io/yocto/feeds/2.1.2/sama5d4_lorix_one/Packages.gz, wget returned 5.
* opkg_download_backend: Failed to download http://lorixone.io/yocto/feeds/2.1.2/sama5d4_lorix_one/Packages.gz, wget returned 5.

Quick and temporary fix

This error is caused by an SSL verification failure in wget, which in turn is caused by the clock of the LORIX One being set too far from the real time of the server.

A simple fix is to correct the time of the LORIX One using the following commands:

$ sudo date -s "7 JUN 2018 14:35:10"
$ sudo hwclock -w

The first line sets the system date to 7 June 2018 and the given time; replace these values with the current date and time. The second line writes the system time to the hardware real-time clock of the LORIX One.


Please note that this is only a temporary fix, since the underlying problem is the ntp daemon, which does not update the system time correctly.
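
A wrong clock can be detected before running opkg update. The following is a minimal sketch; clock_is_plausible is a hypothetical helper, and the threshold 1514764800 (1 January 2018, 00:00:00 UTC) is an arbitrary lower bound chosen for the example.

```shell
#!/bin/sh
# Sketch only: clock_is_plausible is a hypothetical helper.

# Succeed if the system clock is at or after the given Unix timestamp.
# $1 = earliest acceptable time, in seconds since 1970-01-01 00:00:00 UTC.
clock_is_plausible() {
    now=$(date +%s)
    [ "$now" -ge "$1" ]
}

# Example: 1514764800 is 1 January 2018, 00:00:00 UTC (arbitrary lower bound).
#   clock_is_plausible 1514764800 || echo "system clock looks wrong, set it before running opkg update"
```

This only catches a clock that has fallen back to an old date; it cannot detect a small offset from the real time.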

Fix

As far as we know, the time synchronisation issue occurs because the port used by ntpd to contact the time server is blocked. To fix it, please verify that your company/network firewall allows outgoing connections on UDP port 123 for the LORIX One, and let us know if this doesn't fix the problem.