Extracting data from an SMA WebBox
Download SMAwebBox_dataExtractor: SMAwebBox_dataExtractor.tar.gz
The SMA “webBox” device has the capacity for logging, web displaying and uploading to a server data from the SMA devices that are connected to it. Typical SMA devices are inverters, controllers, backups, etc. They are specialized in devices that allow feeding into the grid from renewable sources.
The application “SMAwebBox_dataExtractor” has being developed under a GPL license for inserting in a database the data that has been generated by the “webBox” device.
This document describes an environment that has been successfully used with “SMAwebBox_dataExtractor”. It aims to explain how the whole system works, the different elements which are part of it, how to solve the possible problems that can happen and more.
This documentation is for systems using Windows, as the original project had to be run in this operating system. This does not affect to those ones who still want to use the “SMAwebBox_dataExtractor” application with another operating system as it is written in php. If this is the case this documentation is still useful as it explains how “SMAwebBox_dataExtractor” works.
Every application used has at least linux versions for them (which might be slightly different) so having a read to the whole documentation can be handy for linux users as well. The only exception is the “scheduled tasks” application, linux users will need to work with the cron daemon. Also some applications are installed as services in windows, they should be daemons in Linux.
Elements of the system
In this sections we describe the applications we have used.
- MySql Server: the data displayed is stored in databases. MySql server is manages the acces to these databases.
- PhpMyAdmin: application that works as a graphical interface between an user and MySql Server
- SMA WebBox PushFTP function: the webBox works as a datalogger which collects the data from SMA devices. The PushFTP function can be configured in the webBox so that it uploads the data using the ftp protocol to a server.We don´t explain here how to configure it, it is really easy and also you can find it in the webBox user guide
- Filezilla Server: as the PushFtp function of the webBox starts the communication for uploading the data, we can say that the webBox is the client so we need to have a ftp server that works as a host.
- SMAwebBox_dataExtractor: application which checks if there are new files uploaded from the webBox. If this is the case, it unzips them, resulting in the xml files which contents the actual data. The application then extracts this data and insert it in the database.
- PHP: “SMAwebBox_dataExtractor” is written in PHP. This language is commonly used for writing code that inserted in web documents allow us to create dinamic html documents. Farther than that it is powerful tool for developing software non dependent in the operating system. The applications can easily be launched throw the PHP Command Line Interface (CLI) instead of using a web browser.
- Scheduled tasks: “Scheduled tasks” is the Windows application that allow us to run the SMAwebBox_dataExtractor.php every certain time.
The SMAwebBox_dataExtractor.php is organized as it follows:
- \SMAwebBox_dataExtractor: everything related to this application is there
- SMAwebBox_dataExtractor\pushFTP_files: where the webBox pushFTP function uploads the files
- \SMAwebBox_dataExtractor\zipFILES: to where the application moves the original files (from \pushFTP_files) once the data has been extracted
- \SMAwebBox_dataExtractor r\unextractedFILES: when the application has a problem when unzipping or inserting the data of file in the database, the file is moved to this directory
- SMAwebBox_dataExtractor\miscelanea: interesting infromation but not really needed. (for example a copy of this documentation)
- SMAwebBox_dataExtractor\log: where the log files are stored
There is table for each device and for the webBox log. The name of each table is the serial number of the device from which the data is stored or “log”.
PushFTP files format
The pushFTP function of the webBox uploads the files generated to a certain directory. This files are “.zip” files and its name contents: wbserialnumber (for example: wb150022199), the date and time when the file was generated and the file extension. An example: wb150022199.20100120-121948.zip
In this .zip file there are usually more than one file compressed. These files are also zip files and they content the xml file where the data is stored
XML files format
The data to be inserted in the database is originally in a xml file. The webBox generates two kind of xml files: one for the events that it logs and the other one for the actual data of the devices.
The name of the file contents: the word Log, the date and time when the file was generated and the file extension. An example: Log.20100111_214534.xml
The information in the file can be for more than one event.
These files can be defined as:
- Element Info
- Child elements: Created, Culture
- Element Event
- Child elements: Key, DateTime, EventType, AccessLevel, Category, Device, Module, MessageCode, MessageArgs, Message.
The useful data in this xml file are every child element of “Event”. All of them are inserted in the database
The name of the file contents: the word Mean, the date and time when the file was generated and the file extension. An example: Mean.20100111_220028.xml
The webBox logs the data periodically (every 5, 10 or 15 minutes), and everytime it generates a “Mean” file which contents the data of all the devices connected to the webBox.
These files can be defined as:
- Element “Info”
- Child elements: Created, Culture
- Element MeanPublic
- Child elements: Key, First, Last, Min, Max, Mean, Base, Period, TimeStamp.
The file contents a “MeanPublic” element for every parameter of every device connected to the webBox. In this way the “Key” child element represents the device and parameter name, while the “Mean” child element contents the data associated to that parameter. In other words, from the “Key” element we know which table and field in the database to insert the “Mean” element in. The rest of the child elements are not useful for us. For example: a device which serial number is 1234 will have a “Key” element for the “Pac” parameter which is “1234:Pac”.
For running the “SMAwebBox_dataExtractor.php” application we use the command line interface (CLI) of php, this launches the application. We use “php-win.exe” because it allow us to run the application without the DOS box. The simplest command for launching an application is:
php-win.exe -f application.php -- parameters
In this way the command we use for launching “SMAwebBox_dataExtractor.php” is:
php-win.exe" -f "C:\path\SMAwebBox_dataExtractor\SMAwebBox_dataExtractor_1.0.php"
There is a task programmed in the “Scheduled task” of windows that launch “SMAwebBox_dataExtractor” every minute.
This section refers to the log files that the “SMAwebBox_dataExtractor” generates and has nothing to do with the log files generated by the webBox. The application generates a new log file everyday
The following information is logged:
- The launch and end of the applicationT
- The connection to the database
- A file is moved, deleted or unzipped
- The data in an xml file has successfully been inserted in the database
- Any error on the previuos actions and some information that can be helpful
This sections gives a very brief description about how the main program and some of its functions work.
Basically, when the aplication is launched it opens the directory where the files has been uploaded by the webBox pushFTP function. It reads the directory from the begining. If the file it finds is a .zip file it is unzipped (to the same directory) and moved or deleted (depending if it is the original uploaded file), if the file is an .xml the data it contents is read and inserted into the database (then the file is deleted). Once something has been done with a file, the application reads the next file and when the end of the directory is reached, it starts again from the beginning until all the files has being moved or deleted. Mostly everything which is done by the application is written to the log file.
The code has been commented enough so that it might be easy to understand.
When connecting a new device to the webBox
If a new device is connected to the webBox, it will store its data and send it throw the pushFTP function. The “SMAwebBox_dataExtractor” application will try to insert this data into a table in the database.
In the database there should be a table for every device connected to the webBox. When the “SMAwebBox_dataExtractor” application tries to insert the data of the new device, MySQL server will generate an error if we haven´t created previously the table in the database and “SMAwebBox_dataExtractor” will move that file to the “\unextracted_files” directory.
Thus, it is very important that when we connect a new device to the webBox we create a new table for it in the database. The name of this table must have the next format: device_serialnumber. It must have a field for every parameter which is logged and, of course, the name of the parameter and the field must be the same.
For making it easy you can find some different php files that we have used to insert the tables of the original devices in the database. These files have enough comments to understand the changes you need to do for creating the new ones. These files can be found in:
It is highly recomended that you use original files from your webBox that it has previously uploaded to the ftp server.
This section aims to facilitate the process . Following these instructions will avoid problems as the different elements of the system need to share users, directory paths, etc.
This is a paccket that includes most of the tools we need to use. It is intended to be for setting up a web server based on Apache. When installing XAMP we are also installing Apache but it doesn´t necessarily mean that we have to run it. Anyway, we want to have it runnig so that we can use tools like “phpMyAdmin”
So search the internet for downloading it and install it in “C:\Program Files\”. It will automatically install and configure: Apache, MySQL Server, PHP, phpMyAdmin, Filezilla Server.
If you are reading this it means that you have downloaded and unzipped the “SMAwebBox_dataExtractor” package, please copy it into:
So that you have:
If, when you installed XAMP, you changed the path installation, you need to take it into account.
This path uses the root directory for the web pages served by Apache.
In this directory you must have at least the following directories: pushFTP_files, zipFILES, unextractedFILES and log.
Configuring Filezilla Server
It must be installed as a service, the XAMP installation wizard asks for doing it. Check that the service is configured for starting automatically when the computers starts, also to restart in case of failure (control pannel --> administrative tools --> services).
For security reasons set a password for accessing to the Filezilla server throw the "Filezilla Server Interface".
The Filezilla Server should be listening in the default port: 21
The user and folder that the pushFTP function of the webBox will use for connecting to Filezilla Server canl be (and must be set up in the Filezilla server and the webBox):
- User: webBox, password: userpassword
- Shared folders (for the given user): C:\Program Files\apachefriends\xampp\htdocs\SMAwebBox_dataExtractor\pushFTP_files
- Files rights: read and write
- Directories rights: list and subdirs
For security reasons you can set another user with all rights:
- User: admin, password: adminpassword
- Shared folder: C:\Program Files\apachefriends\xampp\htdocs\
webBox PushFTP function
Follow the webBox user manual for setting up this function. Use the same user, password and directories that when you set up the FileZilla server.
mySQL Server users
For security reasons after the installation of XAMPP we should set a password for the user “root”.
Then we have to create the user that the SMAwebBox_dataExtractor will use for accessing to the database:
- User: your_user
- Password: your_password
- Privileges: SELECT, INSERT, UPDATE, DELETE, CREATE, DROP
Both things can easily be done accessing throw mozilla to localhost and then going to phpMyAdmin
It can be useful to set up another user which can only “SELECT”.and can be used for browsing the contents of the database
“pma” is the user that phpMyAdmin uses for accessing to mySQL Server. For security reasosns, we should change its password throw PHPMyAdmin.
Then we need to indicate it in phpMyAdmin config file which is located in: C:\Program Files\apachefriends\xampp\phpMyAdmin\config.inc.php
Look for the line: $cfg['Servers'][$i]['controluser'] = 'pma'; and then underneath it make the following change:
$cfg['Servers'][$i]['controlpass'] = ''; --> $cfg['Servers'][$i]['controlpass'] = 'pma_password';
Then we also need to do the following changes:
$cfg['Servers'][$i]['extension'] = 'mysql'; --> $cfg['Servers'][$i]['extension'] = mysqli';
$cfg['Servers'][$i]['auth_type'] = 'config'; --> $cfg['Servers'][$i]['auth_type'] = 'http';
First of all create using “phpMyAdmin” the database for storing the data. Give it the name that you want.
Then, create one table for every device connected to the webBox. You can get some instructions about how to do it in the section “When connecting a new device to the webBox”
This is the configuration file for PHP. The file is in: C:\Program Files\apachefriends\xampp\apache\bin\php.ini
Check that next lines are in the file (it´s important to have or not have the semicolon in the beginning)
extension_dir = ".\ext"
Create a new “scheduled task”, the parameters must be set to:
Run: "C:\Archivos de programa\apachefriends\xampp\php\php-win.exe" -f "C:\Archivos de programa\apachefriends\xampp\htdocs\SMAwebBox_dataExtractor \push_ftp_extractor_1.0.php"
“Run when the session has started”
Set the task everyday
Repit the task: every 1 minute
How to configure SMAwebBox_dataExtractor.php
You can find the configuration file in
Open it and write the data indicated which is:
- DB_SERVER: where is the mysql server where the data is being stored, tipically "localhost"
- DB_NAME: the name you gave to the database
- DB_USER: and finally the user and password you set up in the section "mySQL Server users"
- DB_PASSWORD: the same than above
There are files in “\SMAwebBox_dataExtractor\unextracted_FILES”.What does it mean? Shall I do anything?
If there is any problem when unzipping or extracting the contents of the files uploaded from the webBox, they are automatically moved to this folder. Depending in the nature of the problem something might be done.
It is a good idea to check the log files for finding out why a file has been “unextracted”. So open the log file which has the same date than when the file was moved (see “last modified” in the explorer) to the “unextrated_FILES” folder. Then do a search for the name fo the file.
If you find out that there has been an error, we need to fix it. Once the problem has been solved we need to move the files from the “unextracted_FILES” directory to the “push_FTP” one so that the “SMAwebBox_dataExtractor” can try again to unzip or extract the file.
The log files says there is the following error: “ problems with the query: Duplicate entry “. What shall I do?
The database is configured so that there can´t be two registers with the same timestamp in a device´s tables. As the error says, there is already a register with that timestamp. In this situation “SMAwebBox_dataExtractor” will not carry on extracting the rest of the data in that file and will move it to the “unextracted_FILES” directory.
Probably what happened is: there was a previous problem when inserting some of the data of that file, you solved the problem and now you are trying to run the application on that file again. Realize that before the original problem happened you had already inserted some data from that file into the database and now you are trying to do it again (and that´s why you get the new error). Check that this is the reason of the error comparing the data that was going to be inserted (you can see it in the log file) with the database register that has the same timestamp. If this is the reason of the problem you can solve it deleting from the database the register which causes the problem allowing the application to insert it again.
The log files says there is the following error: “ problems with the query: Table '.......' doesn't exist“. What shall I do?
Have you connected a new device to the webBox but not created a new table for it in the database? Have look in the “When connecting a new device to the webBox” section.
There are lots of files in the “\pushFTP_files” directory and they don´t move from there. What´s happening?
Check that the “
SMAwebBox_dataExtractor” is being launched. For example you can check when today´s log file was last modified. If it was a long time ago (more than 5 minutes) something is going wrong. Check the “Scheduled tasks” section.
Then if that was not the problem, check the log file, if it says that the “file couldn´t be copied/deleted” several things can be happening:
The directory structure is corrupted (have a look in the “System description” section)
The user hasn´t got privileges for writing in that directory. You need to change them.
Why there is a file called “info.xml” in the “unextracted_files” directory?
Sometimes the webBox uploads this file. It contents some information about the webBox (as its serial number, MAC address, etc.). It isn´t a zip file, neither a “Mean” or “Log” file (which are the ones whith data to be inserted in the database) so “SMAwebBox_dataExtractor” moves it to this folder. It requies no action from us, and it can be deleted.
The Hard Disk is becoming full. What can I do?
You can make a copy of the “zipFILES” directory and empty it. The files in here are kept as redundancy, they are the original files uploaded from the webBox
How do I check that everything is going allright?
First, for checking that the “SMAwebBox_dataExtractor” application is being launched you can check when was modified today´s log file. It is supposed that the aplication is running every minute. So it should have been modified not more that one minute ago.
Then, you can check that there isn´t any file in “unextracted_Files” directory. If there is anything in there, something has happened.
It´s better that you check the log files looking for error (just search the word “error”) and refer to the “troubleshooting” section in that case.
For checking that the webBox is uploading files have a look in the “pushFTP_files” directory, there might be some files there. Take into account that the “SMAwebBox_dataExtractor” moves the files to the “zipFILES” directory once they have been extracted, so you can also have a look in here. Check when the newest file were “last modified” so that you can have an idea about when the last files were uploaded
Who can I talk with if I have a question?
This big mess was developed by Carlos Alonso Gabizó, you can contact me on firstname.lastname@example.org