Tag Archives: influxdb

Simulating a Tesla Powerwall with InfluxDB

Tesla Powerwall

It’s a little over a year ago that I got solar panels installed on my roof. Not a Tesla Solar Roof, but the most efficient solar panels available last year: the SPR-X21-350-BLK from SunPower. Together with a SolarEdge SE5K inverter they produce 5300 Watt during peak hours.

I’m interested if a Tesla Powerwall is worth it financially. Will it pay back the investment in a reasonable amount of time?

In the Netherlands there is still a net metering policy until 2023, which means that you can administratively subtract the electricity you have delivered to the electricity network from what you have used from it. Basically you can use the electricity network as a big battery to store solar overproduction.

Starting in 2023 the net metering policy will be phased out gradually. Based on the current pricing you pay around €0.25 per kWh, but for delivering you only get between €0.04 and €0.12 per kWh depending on the contract with your supplier. It becomes financially more attractive to use as much electricity as possible from your solar panels directly or via a battery.

Preparing for the year 2023 I have some research questions after one year of having solar panels:

  • How much electricity is actually used directly from my solar panels?
  • Is it worth investing in a Tesla Powerwall to use more of my own solar power?

Collecting the data

Since the start I’ve been collecting metrics from the Solaredge inverter using a wrapper script around sunspec-monitor. The main metric I’m collecting from the inverter is the “energy_total” metric, which is the total amount of Watt hours produced by the inverter. This metric is collected every 60 seconds and stored in InfluxDB.

Using a USB cable connected to the P1 port of my electricity meter I’m also collecting metrics about the amount of electricity consumed from and delivered to the electricity network every 60 seconds.

Overview of metrics collection from Inverter and Electricity Meter

In InfluxDB the stored metrics look like this:

> select net_used, delivered, produced from "electricity" order by time desc limit 10;
name: electricity
time                 net_used delivered produced
----                 -------- --------- --------
2020-05-07T07:32:00Z 3207201  5214424   6368160
2020-05-07T07:31:00Z 3207201  5214367   6368100
2020-05-07T07:30:00Z 3207201  5214310   6368039
2020-05-07T07:29:00Z 3207201  5214253   6367978
2020-05-07T07:28:00Z 3207201  5214196   6367918
2020-05-07T07:27:00Z 3207201  5214140   6367858
2020-05-07T07:26:00Z 3207201  5214084   6367798
2020-05-07T07:25:00Z 3207201  5214028   6367739
2020-05-07T07:24:00Z 3207201  5213972   6367679
2020-05-07T07:23:00Z 3207201  5213917   6367620
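
All three columns are ever-increasing counters in Wh, so the value for a single minute is the difference between two consecutive rows. A minimal sketch in Python, using a few of the rows above; the function name per_minute and the key consumed_directly are only for illustration:

```python
# Rows copied from the query output above, oldest first:
# (time, net_used, delivered, produced), all cumulative counters in Wh.
rows = [
    ("2020-05-07T07:29:00Z", 3207201, 5214253, 6367978),
    ("2020-05-07T07:30:00Z", 3207201, 5214310, 6368039),
    ("2020-05-07T07:31:00Z", 3207201, 5214367, 6368100),
    ("2020-05-07T07:32:00Z", 3207201, 5214424, 6368160),
]


def per_minute(rows):
    """Turn cumulative counters into per-interval values (Wh)."""
    result = []
    for prev, cur in zip(rows, rows[1:]):
        delivered = cur[2] - prev[2]
        produced = cur[3] - prev[3]
        result.append({
            "time": cur[0],
            "net_used": cur[1] - prev[1],
            "delivered": delivered,
            "produced": produced,
            # Whatever was produced but not delivered was used in the house.
            "consumed_directly": produced - delivered,
        })
    return result


for row in per_minute(rows):
    print(row)
```

In these rows nothing is taken from the network (net_used does not change) and almost all production is delivered to it.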

When turning this data into a graph, on a sunny day it looks like this:

Electricity Usage, Delivery, Production on a sunny day (20200404)

On a cloudy day it looks like this:

Electricity Usage, Delivery, Production on a cloudy day (20200414)

As you can see, I’m not using all the electricity produced by the solar panels directly. Most of it actually is delivered to the electricity network. And during the night, there of course is no sun, so I’m using from the electricity network.

Statistics from Year 1

Looking at some day graphs is nice, but what does this mean overall in a year?

  1. How much electricity did the solar panels produce?
  2. How much electricity was delivered to/used from the network?
  3. How much electricity was consumed from the solar panels directly?
  4. How much electricity did I consume in total?

The answers to the first two questions are pretty easy to find. Take the numbers from the 3 columns in InfluxDB from today and subtract the values from 1 year ago.

Question 3 can be answered by subtracting “delivered” from “produced”. It’s also possible to have per minute statistics for direct consumption by creating a “Continuous Query” in InfluxDB:

CREATE CONTINUOUS QUERY cq_consumed ON energy
BEGIN
SELECT mean(produced) - mean(delivered) AS consumed
INTO energy.autogen.electricity
FROM energy.autogen.electricity
GROUP BY time(1m), *
END

The Continuous Query will only generate new “consumed” values. To generate previous values, execute the part between BEGIN and END once.

For the answer to question 4, we need to sum “consumed” (generated by the previous Continuous Query) and “net_used”:

CREATE CONTINUOUS QUERY cq_total_used ON energy
BEGIN
SELECT mean(consumed) + mean(net_used) AS total_used
INTO energy.autogen.electricity
FROM energy.autogen.electricity
GROUP BY time(1m), *
END

In InfluxDB it now looks like this (first and last metric of year 1):

> select net_used, delivered, produced, consumed, total_used from "electricity" where time < '2019-04-21' order by time desc limit 1;
name: electricity
time                 net_used delivered produced consumed total_used
----                 -------- --------- -------- -------- ----------
2019-04-20T23:59:00Z 337620   1224      373      -851     336769

> select net_used, delivered, produced, consumed, total_used from "electricity" where time < '2020-04-21' order by time desc limit 1;
name: electricity
time                 net_used delivered produced consumed total_used
----                 -------- --------- -------- -------- ----------
2020-04-20T23:59:00Z 3117433  4841893   5919356  1077463  4194896
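
The yearly totals follow from these two rows by subtraction. A small sketch in Python, with the counter values copied from the output above; the function name year_totals_kwh is hypothetical, and the outcome may differ by a few kWh from the dashboard figures:

```python
# First and last rows of year 1, copied from the query output above (Wh).
first = {"net_used": 337620, "delivered": 1224, "produced": 373}
last = {"net_used": 3117433, "delivered": 4841893, "produced": 5919356}


def year_totals_kwh(first, last):
    """Subtract the first counter values from the last ones, in kWh."""
    totals = {key: (last[key] - first[key]) / 1000 for key in first}
    # Direct consumption: produced but not delivered to the network.
    totals["consumed"] = totals["produced"] - totals["delivered"]
    # Total usage: direct consumption plus what came from the network.
    totals["total_used"] = totals["consumed"] + totals["net_used"]
    return totals


for name, value in year_totals_kwh(first, last).items():
    print(f"{name}: {value:.0f} kWh")
```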

Putting the results in a Grafana dashboard gives an interesting overview of year 1:

Overview of electricity production, usage and delivery (ignore the rounding errors)

The answers to my questions:

  1. How much electricity did the solar panels produce?
    • 5917kWh produced
  2. How much electricity was delivered to/used from the network?
    • 4841kWh delivered to the network
    • 2779kWh used from the network
  3. How much electricity was consumed from the solar panels directly?
    • 1075kWh consumed directly
  4. How much electricity did I consume in total?
    • 3853kWh used in total

The solar panels produced 54% more than I consumed in total, but I still need to get 72% of my electricity from the electricity network. The average daily production/usage graph below shows why: most of the electricity is consumed when the sun is not shining. 😕

Daily average electricity production / usage

What does this mean in terms of yearly costs/profit, taking €0.25 per kWh for usage and €0.11 per kWh for delivery? Without solar panels my costs would have been (3853 x 0.25) = €963.25.

  • With net metering: (4841-2779) x 0.11 = €226.82 profit
  • Without net metering: 694.75 – 532.51 = €162.24 costs
    • 2779 x 0.25 = €694.75 costs
    • 4841 x 0.11 = €532.51 profit

What if I only would get €0.04 per kWh for delivery?

  • With net metering: (4841-2779) x 0.04 = €82.48 profit
  • Without net metering: 694.75 – 193.64 = €501.11 costs
    • 2779 x 0.25 = €694.75 costs
    • 4841 x 0.04 = €193.64 profit
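
The same arithmetic as a small Python function, so other tariffs are easy to try. This is only a sketch of the calculation in the bullets above; the function name yearly_balance and its parameters are hypothetical, and the net metering branch uses the same simplification as above (it only makes sense when more is delivered than used):

```python
def yearly_balance(used_kwh, delivered_kwh, price_use, price_deliver, net_metering):
    """Yearly balance in euros: positive is profit, negative is costs."""
    if net_metering:
        # Delivery is subtracted from usage first; the surplus is paid out
        # at the delivery price (assumes delivered_kwh >= used_kwh).
        return round((delivered_kwh - used_kwh) * price_deliver, 2)
    # Without net metering usage and delivery are priced separately.
    return round(delivered_kwh * price_deliver - used_kwh * price_use, 2)


for price_deliver in (0.11, 0.04):
    for net_metering in (True, False):
        balance = yearly_balance(2779, 4841, 0.25, price_deliver, net_metering)
        print(f"delivery at {price_deliver}, net metering {net_metering}: {balance}")
```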

Conclusions from Year 1

  • Only 18% of the solar energy is directly consumed
  • About 54% more electricity is produced than actually needed
  • Still 72% of the electricity comes from the electricity network, because there is no(t enough) solar energy available when needed
  • Currently there is an annual profit of €226.82
  • Without net metering that would have been €162.24 costs
  • Future worst case (€0.04 without net metering after 2023): €501.11 costs 🙁

Simulating the Tesla Powerwall

The Tesla Powerwall. There are many interesting things to write about it, but let’s keep it simple and focused. Some specs:

  • 14kWh of electricity can be stored
  • 13.5kWh of this is usable (completely discharging would be bad for the battery)
  • You get back 90% of the electricity you put in (Round Trip Efficiency)
  • Current price: € 8240

With all the data in InfluxDB and the specs above, it is possible to simulate a Powerwall minute by minute. I’ve written some python code that does this. In the simulation the battery has a minimum threshold of 500 Wh, a maximum of 14000 Wh and it takes the Round Trip Efficiency of 90% into account by dividing by 0.9 when electricity is used from the battery.

#!/usr/bin/env python3
from influxdb import InfluxDBClient

influx_client = InfluxDBClient('localhost', 8086, 'username', 'password', 'database')


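def influx(measurements):
    """Write a batch of points to InfluxDB; log failures instead of aborting."""
    try:
        influx_client.write_points(measurements)
    except Exception as e:
        print('Failed to write to influxdb: ', e)


def charge(battery, delivered, net_used):
    """Apply one interval of network usage/delivery (in Wh) to the simulated battery."""
    if net_used > 0:
        # Discharging: supplying net_used Wh drains net_used / efficiency Wh.
        if battery['level'] - (net_used / battery['efficiency']) < battery['min']:
            # Not enough charge: drain the battery to its minimum threshold
            # and take the remainder from the electricity network.
            remaining_battery = battery['level'] - battery['min']
            battery['level'] = battery['min']
            battery['net_usage'] = battery['net_usage'] + (net_used - (remaining_battery * battery['efficiency']))
        else:
            battery['level'] = battery['level'] - (net_used / battery['efficiency'])

    if delivered > 0:
        # Charging: store the overproduction until the battery is full.
        if battery['level'] + delivered > battery['max']:
            # Battery full: the surplus is delivered to the electricity network.
            remaining_battery = battery['max'] - battery['level']
            battery['level'] = battery['max']
            battery['net_delivery'] = battery['net_delivery'] + (delivered - remaining_battery)
        else:
            battery['level'] = battery['level'] + delivered

    return battery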


def main():
    prev_point = None
    measurements = []
    starttime = '2019-01-01'
    battery = {
        "level": 0,
        "net_usage": 0,   
        "net_delivery": 0,
        "min": 500,  
        "max": 14000,
        "efficiency": 0.9,
    }

    result = influx_client.query("""select delivered, net_used from "autogen"."electricity" where time >= '{}' order by time;""".format(starttime))

    points = result.get_points()
    for point in points:
        if prev_point is not None:
            for key in ['delivered', 'net_used']:
                if point[key] is None:
                    point[key] = prev_point[key]
            delivered = int(point['delivered']) - int(prev_point['delivered'])
            net_used = int(point['net_used']) - int(prev_point['net_used'])
            battery = charge(battery, delivered, net_used)
            measurements.append({
              "measurement": "battery",
              "time": point['time'],
              "fields": {
                  "powerwall_level": int(battery['level']),
                  "powerwall_net_usage": int(battery['net_usage']),
                  "powerwall_net_delivery": int(battery['net_delivery']),
              },
            })
        prev_point = point
        if len(measurements) > 1000:
            influx(measurements)
            print(".")
            measurements = []
    influx(measurements)


if __name__ == "__main__":
    main()
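
To see what the Round Trip Efficiency does in the simulation, here is a condensed restatement of a single simulation step without InfluxDB, fed with a few hand-picked values. The function name step, the starting level of 1000 Wh and the example values are only for illustration; the logic is meant to mirror charge() above:

```python
def step(level, delivered, net_used, minimum=500, maximum=14000, efficiency=0.9):
    """One simulated interval; returns (level, from_network, to_network) in Wh."""
    from_network = 0.0
    to_network = 0.0
    if net_used > 0:
        drain = net_used / efficiency  # the battery loses more than the house gets
        if level - drain < minimum:
            usable = level - minimum
            level = minimum
            from_network = net_used - usable * efficiency
        else:
            level -= drain
    if delivered > 0:
        if level + delivered > maximum:
            to_network = delivered - (maximum - level)
            level = maximum
        else:
            level += delivered
    return level, from_network, to_network


level = 1000.0
# (delivered, net_used) per interval: light usage, heavy usage, big overproduction
for delivered, net_used in [(0, 90), (0, 450), (14000, 0)]:
    level, from_network, to_network = step(level, delivered, net_used)
    print(f"level={level:.0f} Wh, from network={from_network:.0f} Wh, "
          f"to network={to_network:.0f} Wh")
```

Supplying 90 Wh to the house costs the battery 100 Wh. Once the minimum threshold is reached the rest comes from the network, and overproduction beyond the maximum is delivered to the network.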

A part of the result is shown below in a graph with the simulation of 3 days. To compare, I’ve also created a graph without the Tesla Powerwall simulation.

Tesla Powerwall Simulation: 3 days of electricity usage, delivery and Tesla Powerwall electricity level
3 day electricity usage, delivery and production without a Tesla Powerwall

With a Tesla Powerwall less electricity is used from the network (Net Usage (blue)). During a sunny day (day 2) the Powerwall is completely charged. The day after, electricity that was produced the day before is still being used from the Powerwall. That is pretty cool! 🙂

What would it have meant if I had had a Powerwall in the past year? These statistics can be collected the same way the “Statistics from Year 1” were collected.

> select * from "battery" where time < '2019-04-22' order by time asc limit 1;
name: battery
time                 powerwall_level powerwall_net_delivery powerwall_net_usage
----                 --------------- ---------------------- -------------------
2019-04-21T00:01:00Z 500             0                      453

> select * from "battery" where time < '2020-04-21' order by time desc limit 1;
name: battery
time                 powerwall_level powerwall_net_delivery powerwall_net_usage
----                 --------------- ---------------------- -------------------
2020-04-20T23:59:00Z 10216           2876167                1020952

Querying InfluxDB for this data shows that with a Powerwall, 2877kWh would have been delivered to the electricity network and 1021kWh would have still been used from the network. This Grafana dashboard gives a clear overview:

Without a Powerwall I needed to get 72% of my electricity from the network. This is now reduced to 26%. 31% of the solar production gets stored in the Powerwall for later use.

Why do I still need to get 26% of the electricity from the network?

Daily Electricity Network Usage / Delivery when simulating a Tesla Powerwall

The graph above shows why. In the winter there is just not enough solar production to cover my needs. The graph below shows the same data, but without a Powerwall.

Daily Electricity Network Usage / Delivery (without a Tesla Powerwall)

Back to the simulation. What does it mean in terms of costs/profit, again taking €0.25 per kWh for usage and €0.11 per kWh for delivery?

  • With net metering: (2877-1021) x 0.11 = €204.16 profit
  • Without net metering: 316.47 – 255.25 = €61.22 profit
    • 1021 x 0.25 = €255.25 costs
    • 2877 x 0.11 = €316.47 profit

Or what if I only would get €0.04 per kWh for delivery?

  • With net metering: (2877-1021) x 0.04 = €74.24 profit
  • Without net metering: 255.25 – 115.08 = €140.17 costs
    • 1021 x 0.25 = €255.25 costs
    • 2877 x 0.04 = €115.08 profit

Conclusions from Simulating a Tesla Powerwall

  • I would make €204.16 profit instead of €226.82 currently, with net metering. This is actually a decrease in profit because energy gets lost because of the round trip efficiency of the Powerwall.
  • Worst case in the example scenario described above (€0.04 per kWh for delivery), without net metering I would have €140.17 energy costs with a Powerwall and €501.11 costs without. Here it starts to work out.
  • With a yearly cost saving of €360.94 (501.11-140.17) it would take around 23 years to make a Tesla Powerwall profitable in my current situation.

So is it worth investing in a Tesla Powerwall to use more of my own solar power?

It depends on your investment horizon, but for me it’s a “No”. 23 years is a bit too much, taking into account that the Tesla Powerwall 2 only has a warranty period of 10 years. Besides that, the battery quality will get worse over time and the storage capacity of the Powerwall will decrease.

Other considerations

Timing of Heating Hot-water Storage Tank

You have to get it while it’s hot, right? Definitely with solar. To prepare for the year 2023, you should use as much electricity from your solar panels directly as possible, even if you have a Tesla Powerwall.

I’m not going to cook earlier during the day. And the low-temperature heating mostly happens in the winter during the night to keep the in-house temperature stable, which is supposed to be efficient already. But what could be done is heating the hot-water storage tank during the day, when there is solar power available. This should decrease the amount of electricity delivered to and used from the electricity network.

LG Chem RESU

You don’t have to buy a Tesla Powerwall. There are many alternatives, like a battery from the LG Chem RESU series. I’ve done the same calculations as with the Powerwall, without net metering and with €0.04 per kWh for delivery. With a Return on Investment of 12 to 14 years for the smaller models, this seems to be more interesting than the Powerwall.

            Net delivery  Net Usage   Stored kWh  Investment  Savings / Year  ROI
----------  ------------  ----------  ----------  ----------  --------------  ----------
No battery  4841 (82%)    2779 (72%)  0 (0%)      € 0         € -501.11
Powerwall   2877 (49%)    1021 (26%)  1965 (33%)  € 8240      € 360.94        22.8 years
RESU 13     2958 (50%)    1009 (26%)  1884 (32%)  € 6853      € 367.18        18.7 years
RESU 10     3077 (52%)    1118 (29%)  1765 (30%)  € 4961      € 344.69        14.4 years
RESU 6.6    3302 (56%)    1327 (34%)  1540 (26%)  € 3812      € 301.44        12.6 years
RESU 3.3    3870 (65%)    1861 (48%)  972 (16%)   € 2602      € 190.66        13.6 years
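
The ROI column is a simple payback period: the investment divided by the yearly savings. A sketch in Python with the values copied from the table above; the function name payback_years is hypothetical, and the calculation ignores battery degradation and any change in electricity prices:

```python
# (investment in EUR, yearly savings in EUR), copied from the table above.
batteries = {
    "Powerwall": (8240, 360.94),
    "RESU 13": (6853, 367.18),
    "RESU 10": (4961, 344.69),
    "RESU 6.6": (3812, 301.44),
    "RESU 3.3": (2602, 190.66),
}


def payback_years(investment, savings_per_year):
    """Years needed before the yearly savings add up to the investment."""
    return investment / savings_per_year


for name, (investment, savings) in batteries.items():
    print(f"{name}: {payback_years(investment, savings):.1f} years")
```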

Server stats with collectd, InfluxDB and Grafana (with downsampling)

Almost 10 years ago I started developing a web frontend for collectd, Collectd Graph Panel (CGP). A PHP frontend that displays graphs in PNG format using rrdtool and the RRD files created by collectd.

A lot has happened since then. Because of the IoT hype time series databases like Graphite, InfluxDB and TimescaleDB became more popular. Also visualization tools gained more traction, of which Grafana is the most popular one.

In this blogpost I’m going to show a replacement of collectd, RRD files and CGP, by using collectd, InfluxDB and Grafana. I will:

  1. Hook up collectd to InfluxDB to store the metrics
  2. Configure InfluxDB to aggregate data over time (it doesn’t do this automatically like RRD)
  3. Use a Grafana dashboard to display the graphs with the same colors and styling I was used to in CGP

Hooking up collectd to InfluxDB

This is pretty simple. First of all follow the installation guide to install the InfluxDB service.

InfluxDB supports the collectd protocol. It can be configured to listen on UDP port 25826, which collectd clients can send metrics to.

I more or less used the default values that were already provided in /etc/influxdb/influxdb.conf:

[[collectd]]   
  enabled = true
  bind-address = ":25826"
  database = "collectd"
  retention-policy = ""
  typesdb = "/usr/share/collectd/types.db"
  security-level = "none"
  batch-size = 5000
  batch-pending = 10  
  batch-timeout = "10s"
  read-buffer = 0

In the configuration of the collectd clients, InfluxDB can be configured as server in the network plugin:

LoadPlugin network
<Plugin network> 
  Server "<InfluxDB-IP-address>" "25826"
</Plugin>

The metrics the collectd clients collect are now sent to InfluxDB.

Downsampling data in InfluxDB

Unlike with the RRD files created by collectd, InfluxDB doesn’t come with a default downsampling policy. Metrics are just sent by the collectd clients every 10 seconds, saved in InfluxDB and kept indefinitely. You will have super detailed graphs when you, for example, zoom in on some hourly statistics from 5 months ago, but your InfluxDB data-set will keep growing, resulting in gigabytes of data per collectd client.

In my experience for server statistics you want to have detailed graphs for the most recent metrics. This is useful when you want to debug an issue. Older metrics are nice to display weekly, monthly, quarterly or yearly graphs to spot trends. For graphs with these timeframes 10 second metrics are not required. Metrics for these graphs can be aggregated.

In InfluxDB the combination of “Retention Policies” (RPs) and “Continuous Queries” (CQs) can be used to downsample the metrics. One of the things you can define with an RP is for how long InfluxDB keeps the data. CQs automatically and periodically execute pre-defined queries. This can be used to aggregate the metrics to a different RP.

I’ve been fairly happy with the aggregation policy in the RRD files used by collectd. Let’s try to setup the same data aggregation system in InfluxDB.

Information about the aggregation policy can be extracted from the RRD file by using the rrdinfo command. Let’s take for example the cpu-idle.rrd file. This shows that this RRD file contains 1 metric per 10 seconds:

$ rrdinfo cpu-idle.rrd | grep step
step = 10

And this shows the different aggregation policies for the average value of the metrics:

$ rrdinfo cpu-idle.rrd | grep AVERAGE -A6 | egrep '(rows|pdp_per_row)'
rra[0].rows = 1200
rra[0].pdp_per_row = 1
rra[3].rows = 1235
rra[3].pdp_per_row = 7
rra[6].rows = 1210   
rra[6].pdp_per_row = 50
rra[9].rows = 1202
rra[9].pdp_per_row = 223
rra[12].rows = 1201
rra[12].pdp_per_row = 2635

There are 5 different aggregations. They all have Primary Data Points per row (pdp_per_row), which means that for example 1 row (metric) is an aggregation of 7 Primary Data Points. And it shows the number of rows that are kept.

Summarized this RRD file contains:

  • 1200 metrics of a 10 second interval (12000s of data =~ 3.33 hours)
  • 1235 metrics of a (7*10) 70 second interval (86450s of data =~ 1 day)
  • 1210 metrics of a (50*10) 500 second interval (605000s of data =~ 1 week)
  • 1202 metrics of a (223*10) 2230 second interval (2680460s of data =~ 31 days)
  • 1201 metrics of a (2635*10) 26350 second interval (31646350s of data =~ 366 days)
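
These numbers follow directly from the rrdinfo output: the interval is pdp_per_row times the step, and the covered time span is the number of rows times that interval. A small Python sketch of the arithmetic (the function name coverage is only for illustration):

```python
STEP = 10  # seconds, from `rrdinfo cpu-idle.rrd | grep step`

# (rows, pdp_per_row) of the AVERAGE archives shown above
archives = [(1200, 1), (1235, 7), (1210, 50), (1202, 223), (1201, 2635)]


def coverage(rows, pdp_per_row, step=STEP):
    """Return (interval in seconds, total covered seconds) for one archive."""
    interval = pdp_per_row * step
    return interval, rows * interval


for rows, pdp_per_row in archives:
    interval, total = coverage(rows, pdp_per_row)
    print(f"{rows} rows x {interval}s = {total}s ({total / 86400:.2f} days)")
```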

Let’s connect to our influxdb instance and configure the same using RPs and CQs.

$ influx
Connected to http://localhost:8086 version 1.7.6
InfluxDB shell version: 1.7.6
Enter an InfluxQL query
> show databases
name: databases
name
----
_internal
collectd
> use collectd
Using database collectd
> show retention policies
name    duration  shardGroupDuration replicaN default
----    --------  ------------------ -------- -------
autogen 0s        168h0m0s           1        true

The database by default contains the “autogen” RP, with a duration of 0s. No data will be thrown away. First modify the duration of the autogen retention policy to 200 minutes:

> alter retention policy "autogen" on "collectd" duration 200m shard duration 1h
> show retention policies
name    duration  shardGroupDuration replicaN default
----    --------  ------------------ -------- -------
autogen 3h20m0s   1h0m0s             1        true  

Now add the additional RPs:

> CREATE RETENTION POLICY "day" ON collectd DURATION 1d REPLICATION 1
> CREATE RETENTION POLICY "week" ON collectd DURATION 7d REPLICATION 1
> CREATE RETENTION POLICY "month" ON collectd DURATION 31d REPLICATION 1
> CREATE RETENTION POLICY "year" ON collectd DURATION 366d REPLICATION 1
> show retention policies
name    duration  shardGroupDuration replicaN default
----    --------  ------------------ -------- -------
autogen 3h20m0s   1h0m0s             1        true  
day     24h0m0s   1h0m0s             1        false
week    168h0m0s  24h0m0s            1        false
month   744h0m0s  24h0m0s            1        false
year    8784h0m0s 168h0m0s           1        false

For downsampling in InfluxDB I want to use more logical durations compared to what was in the RRD file:

  • 70s -> 60 seconds
  • 500s -> 300 seconds (5 minutes)
  • 2230s -> 1800 seconds (30 minutes)
  • 26350s -> 21600 seconds (6 hours)

These CQs will downsample the data accordingly:

> CREATE CONTINUOUS QUERY "cq_day" ON "collectd" BEGIN SELECT mean(value) as value INTO "collectd"."day".:MEASUREMENT FROM /.*/ GROUP BY time(60s),* END
> CREATE CONTINUOUS QUERY "cq_week" ON "collectd" BEGIN SELECT mean(value) as value INTO "collectd"."week".:MEASUREMENT FROM /.*/ GROUP BY time(300s),* END
> CREATE CONTINUOUS QUERY "cq_month" ON "collectd" BEGIN SELECT mean(value) as value INTO "collectd"."month".:MEASUREMENT FROM /.*/ GROUP BY time(1800s),* END
> CREATE CONTINUOUS QUERY "cq_year" ON "collectd" BEGIN SELECT mean(value) as value INTO "collectd"."year".:MEASUREMENT FROM /.*/ GROUP BY time(21600s),* END
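
Every RP has exactly one matching CQ, so the statements can also be generated from a single table to keep durations and intervals in sync. A sketch in Python that only builds the InfluxQL strings used above and does not talk to InfluxDB; the names POLICIES and statements are hypothetical:

```python
# RP name -> (retention duration, downsampling interval), as chosen above
POLICIES = {
    "day": ("1d", "60s"),
    "week": ("7d", "300s"),
    "month": ("31d", "1800s"),
    "year": ("366d", "21600s"),
}


def statements(database="collectd"):
    """Yield the CREATE RETENTION POLICY / CONTINUOUS QUERY statements."""
    for name, (duration, interval) in POLICIES.items():
        yield (f'CREATE RETENTION POLICY "{name}" ON {database} '
               f'DURATION {duration} REPLICATION 1')
        yield (f'CREATE CONTINUOUS QUERY "cq_{name}" ON "{database}" BEGIN '
               f'SELECT mean(value) as value '
               f'INTO "{database}"."{name}".:MEASUREMENT FROM /.*/ '
               f'GROUP BY time({interval}),* END')


for statement in statements():
    print(statement)
```

The printed statements can be pasted into the influx shell.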

With these CQs and RPs configured you will get 5 data streams: autogen (the default), day, week, month and year. To retrieve the aggregated metrics from a specific RP you have to prefix the measurement in your select query with it. So for example to get the cpu idle metrics you can execute this to get the metrics in the 10s resolution:

> select * from "cpu_value"
# or
> select * from "autogen"."cpu_value"

To get it in 60s resolution (RP “day”):

> select * from "day"."cpu_value"

This is important to know when creating graphs in Grafana. When you want to show a “month” or “year” graph you cannot simply do select value from "cpu_value" where type_instance='idle', because you will only get the metrics from the “autogen” RP. You have to explicitly define the RP.

Collectd graphs in Grafana

To install Grafana follow the installation guide.

Create a user in InfluxDB that can be used in Grafana to read data from InfluxDB:

> create user grafana with password <PASSWORD>
> grant read on collectd to grafana

To get access to the collectd data in InfluxDB you need to configure a data source in Grafana:

Configure CollectD data source.

Now let’s for example create a graph for the load average.

Select Retention Policy in query

As you can see you have to explicitly select the RP for the metrics you want to display in the graph. There is no easy way to get metrics automatically from all RPs at once. This is of course not really convenient, because once the graph on your dashboard is configured you want to be able to change the time range and just see the data from whatever RP that has the metrics in the most detailed way. So ideally you want the RP to be automatically selected based on the time range that is selected.

There are luckily more people having this issue and Talek found a nice workaround for it.

We can create a variable that executes a query based on the current “From” and “To” time range values in Grafana to find out what the correct RP is. This variable can be refreshed every time the time range changes. The query to find out the correct RP is executed on measurement “rp_config” that has a separate RP (forever) without a duration so this data never gets deleted.

Configure the extra RP and insert the RP data:

CREATE RETENTION POLICY "forever" ON "collectd" DURATION INF REPLICATION 1
INSERT INTO forever rp_config,idx=1 rp="autogen",start=0i,end=12000000i,interval="10s" -9223372036854775806
INSERT INTO forever rp_config,idx=2 rp="day",start=12000000i,end=86401000i,interval="60s" -9223372036854775806
INSERT INTO forever rp_config,idx=3 rp="week",start=86401000i,end=604801000i,interval="300s" -9223372036854775806
INSERT INTO forever rp_config,idx=4 rp="month",start=604801000i,end=2678401000i,interval="1800s" -9223372036854775806
INSERT INTO forever rp_config,idx=5 rp="year",start=2678401000i,end=31622401000i,interval="21600s" -9223372036854775806

In the start and end times I added one extra second (86400000i -> 86401000i) because I noticed that when, for example, selecting the “Last 24 hours” range in Grafana, $__to - $__from was never exactly 86400000 milliseconds.
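
The selection logic behind the variable can be sketched in Python. This is not the Grafana query itself. It assumes the query picks the row where the length of the time range in milliseconds is larger than “start” and at most “end”; the function name select_rp and the exact boundary handling are assumptions:

```python
# (rp, start in ms, end in ms), the same values as inserted into rp_config above
RP_CONFIG = [
    ("autogen", 0, 12000000),
    ("day", 12000000, 86401000),
    ("week", 86401000, 604801000),
    ("month", 604801000, 2678401000),
    ("year", 2678401000, 31622401000),
]


def select_rp(from_ms, to_ms):
    """Pick the RP whose window matches the length of the time range."""
    span = to_ms - from_ms
    for rp, start, end in RP_CONFIG:
        # Assumed boundaries: start < span <= end
        if start < span <= end:
            return rp
    return None  # range is longer than the "year" RP covers


hour = 3600 * 1000
print(select_rp(0, 3 * hour))        # last 3 hours
print(select_rp(0, 24 * hour))       # last 24 hours
print(select_rp(0, 7 * 24 * hour))   # last 7 days
```

Like the variable in Grafana, this only looks at the length of the range, not at where the range starts.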

Create the variable in Grafana:

Create $rp variable in Grafana

And use the $rp variable as RP in the queries to create the graph:

Configure $rp in query

There is one caveat with this solution. It only works when the end of the time range is now (current time), for example by selecting a “Quick range” that starts with “Last …”. The query only looks at how long the time range is. Not if the RP contains the full time range. I’ve not been able to achieve this by using the available variables in Grafana like $__from, $__to and $__timeFilter and the possibilities that InfluxQL has. I’ve tried to adjust the query to do something like select rp from rp_config where $__from > now() - "end", but that is not supported by InfluxDB and returns an empty result.

The effect of the caveat is that when you zoom in on older metrics, the $rp variable will select an RP that does not contain the data anymore. When changing the $rp variable manually you can see that less detailed metrics are available in different RPs. For example:

GIF of different retention policies

Result: Less storage required

I monitor 6 systems with collectd in my small home-setup. After configuring the collectd clients to send the metrics to InfluxDB and running this setup without RPs and CQs for a couple of weeks it already required 6 gigabyte of storage. After configuring the RPs and CQs the collectd InfluxDB now uses 72 MB. The RRD files in my previous setup used ~186 MB for these 6 systems.

Free space (var-lib-influxdb)

Grafana Dashboard available

To make things easy I’ve already created a dashboard that uses the same colors and styling as Collectd Graph Panel. It can be downloaded here: https://grafana.com/dashboards/10179

Grafana: CollectD Graph Panel

Measuring Power Consumption with Broadlink SP3S, python, influxdb and grafana

A while ago I was researching the possibilities to measure the power consumption of some devices in my house via Wifi. I came across the Broadlink SP3S Smart Plug. It met my requirements: relatively cheap, power measurement and Wifi. It comes with an iOS and Android app. There is a big chance the app is not directly connecting to the SP3S, but to “the Cloud” where the SP3S sends its data to. This is how most companies design their products nowadays. I wasn’t really looking forward to sharing my power consumption data with Broadlink in “the Cloud”. With the app you can also turn the power on/off, which scares me a little bit. The Broadlink Cloud controlling this power switch? Nah, not for me.

I will explain how I installed the Broadlink SP3S without it making a connection to the internet and show how I use a python script to read the power meter data from the SP3S, store it to InfluxDB and use Grafana to display the collected data in a graph.

Note: When you want to buy a Broadlink SP3S, please make sure you buy the SP3S and not the SP3, which only is a power switch, not a power meter.

Install the Broadlink SP3S

In the step-by-step instructions below I will configure the SP3S to connect to my Wifi so I can connect to it from my local network to retrieve the power meter data. I use a laptop running Linux to initially connect to the SP3S and configure it. I also run a Debian Linux machine as router to control the firewall between the local network and the internet.

  • Plug the SP3S in a wall socket
  • Press the On/Off button for 6 seconds to reset the SP3S. The power button starts blinking rapidly.
  • Press the On/Off button another 6 seconds to enable the Wifi Access Point on the SP3S. The power button blinks rapidly with pauses.
  • Connect to the Wifi Access Point, it should be called “BroadlinkProv”
  • Look up the MAC address of the SP3S
$ ip neigh
192.168.10.1 dev wlp3s0 lladdr 34:ea:34:79:7b:ff REACHABLE
  • Block the MAC address to access the Internet in the router (I’m using a Debian Linux machine as router). It is important to block the MAC address before connecting the SP3S to your Wifi network so that it will never be able to access the internet.
$ iptables -A FORWARD -m mac --mac-source 34:ea:34:79:7b:ff -j DROP
$ ip6tables -A FORWARD -m mac --mac-source 34:ea:34:79:7b:ff -j DROP
  • Use python-broadlink to send the Wifi configuration (SSID and password) to the SP3S
$ git clone https://github.com/mjg59/python-broadlink
$ cd python-broadlink
$ python3 -m venv venv
$ . venv/bin/activate
$ pip3 install pyaes
$ mkdir lib
$ ln -s broadlink lib/broadlink
$ python3
Python 3.5.3 (default, Sep 27 2018, 17:25:39)
[GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import broadlink
>>> broadlink.setup('myssid', 'mynetworkpass', 3)
  • Now you will get disconnected from the SP3S Wifi Access Point. The SP3S will connect to the Wifi network configured above

When this firewall rule is added to the router as well, you will see that the SP3S immediately tries to connect to the internet.

$ iptables -I FORWARD -m mac --mac-source 34:ea:34:79:7b:ff -j LOG --log-level debug --log-prefix "Broadlink: "

$ tail /var/log/syslog
Broadlink: IN=eth1 OUT=eth0 MAC=e0:69:95:73:10:bf:34:ea:34:79:7b:ff:08:00 SRC=10.0.0.199 DST=13.231.11.213 LEN=76 TOS=0x00 PREC=0x00 TTL=63 ID=258 PROTO=UDP SPT=16404 DPT=16384 LEN=56
Broadlink: IN=eth1 OUT=eth0 MAC=e0:69:95:73:10:bf:34:ea:34:79:7b:ff:08:00 SRC=10.0.0.199 DST=13.231.11.213 LEN=76 TOS=0x00 PREC=0x00 TTL=63 ID=259 PROTO=UDP SPT=16404 DPT=1812 LEN=56
Broadlink: IN=eth1 OUT=eth0 MAC=e0:69:95:73:10:bf:34:ea:34:79:7b:ff:08:00 SRC=10.0.0.199 DST=13.231.11.213 LEN=76 TOS=0x00 PREC=0x00 TTL=63 ID=260 PROTO=UDP SPT=16404 DPT=8080 LEN=56
Broadlink: IN=eth1 OUT=eth0 MAC=e0:69:95:73:10:bf:34:ea:34:79:7b:ff:08:00 SRC=10.0.0.199 DST=13.231.11.213 LEN=76 TOS=0x00 PREC=0x00 TTL=63 ID=261 PROTO=UDP SPT=16404 DPT=80 LEN=56
Broadlink: IN=eth1 OUT=eth0 MAC=e0:69:95:73:10:bf:34:ea:34:79:7b:ff:08:00 SRC=10.0.0.199 DST=13.231.11.213 LEN=76 TOS=0x00 PREC=0x00 TTL=63 ID=262 PROTO=UDP SPT=16404 DPT=8090 LEN=56
Broadlink: IN=eth1 OUT=eth0 MAC=e0:69:95:73:10:bf:34:ea:34:79:7b:ff:08:00 SRC=10.0.0.199 DST=54.238.198.224 LEN=76 TOS=0x00 PREC=0x00 TTL=63 ID=263 PROTO=UDP SPT=16404 DPT=16384 LEN=56
Broadlink: IN=eth1 OUT=eth0 MAC=e0:69:95:73:10:bf:34:ea:34:79:7b:ff:08:00 SRC=10.0.0.199 DST=54.238.198.224 LEN=76 TOS=0x00 PREC=0x00 TTL=63 ID=264 PROTO=UDP SPT=16404 DPT=1812 LEN=56
Broadlink: IN=eth1 OUT=eth0 MAC=e0:69:95:73:10:bf:34:ea:34:79:7b:ff:08:00 SRC=10.0.0.199 DST=54.238.198.224 LEN=76 TOS=0x00 PREC=0x00 TTL=63 ID=265 PROTO=UDP SPT=16404 DPT=8080 LEN=56
Broadlink: IN=eth1 OUT=eth0 MAC=e0:69:95:73:10:bf:34:ea:34:79:7b:ff:08:00 SRC=10.0.0.199 DST=54.238.198.224 LEN=76 TOS=0x00 PREC=0x00 TTL=63 ID=266 PROTO=UDP SPT=16404 DPT=80 LEN=56
Broadlink: IN=eth1 OUT=eth0 MAC=e0:69:95:73:10:bf:34:ea:34:79:7b:ff:08:00 SRC=10.0.0.199 DST=54.238.198.224 LEN=76 TOS=0x00 PREC=0x00 TTL=63 ID=267 PROTO=UDP SPT=16404 DPT=8090 LEN=56

Let’s try to find out what the destination IP addresses are by using tcpdump.

$ tcpdump -ni eth1 host 10.0.0.199 and port 53
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 262144 bytes
12:38:22.460717 IP 10.0.0.199.1169 > 10.0.0.7.53: 0+ A? 38010main.broadlink.com.cn. (44)
12:38:22.460870 IP 10.0.0.7.53 > 10.0.0.199.1169: 0 1/0/0 A 13.231.11.213 (60)
12:38:38.480835 IP 10.0.0.199.1171 > 10.0.0.7.53: 0+ A? 38010backup.broadlink.com.cn. (46)
12:38:38.480962 IP 10.0.0.7.53 > 10.0.0.199.1171: 0 1/0/0 A 54.238.198.224 (62)

So the SP3S immediately tries to contact 38010main.broadlink.com.cn. (13.231.11.213) and 38010backup.broadlink.com.cn. (54.238.198.224) on ports 16384, 1812, 8080, 80 and 8090 once it has a network connection. The iptables DROP rules in my router block this traffic. 🙂
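The list of destinations and ports above can also be extracted from the log automatically. The snippet below is a minimal sketch, assuming the "Broadlink: " log prefix from the iptables LOG rule above; the function name `summarize` and the regular expression are my own illustration, not part of any tool used in this post.

```python
import re
from collections import defaultdict

# Picks the destination IP, protocol and destination port out of an iptables LOG line.
LOG_FIELDS = re.compile(r"DST=(?P<dst>\S+).*?PROTO=(?P<proto>\S+).*?DPT=(?P<dpt>\d+)")


def summarize(lines, prefix="Broadlink: "):
    """Return {destination IP: sorted list of (protocol, port)} for lines with the prefix."""
    seen = defaultdict(set)
    for line in lines:
        if prefix not in line:
            continue  # not one of our LOG rule entries
        match = LOG_FIELDS.search(line)
        if match:
            seen[match.group("dst")].add((match.group("proto"), int(match.group("dpt"))))
    return {dst: sorted(ports) for dst, ports in seen.items()}


if __name__ == "__main__":
    # Assumes the LOG entries end up in /var/log/syslog, as in the tail command above.
    with open("/var/log/syslog") as logfile:
        for dst, ports in sorted(summarize(logfile).items()):
            print(dst, ports)
```

Using a set per destination means repeated connection attempts to the same port are only listed once.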

Using broadlink_cli to retrieve meter data

Using “broadlink_cli” from python-broadlink, the current energy consumption can be retrieved from the SP3S. To make “broadlink_cli” work when using the cloned git repository as a Python library, a few things need to be modified.

Create a symlink to “broadlink_cli”:

$ ln -s cli/broadlink_cli

Edit “broadlink_cli” and change this:

import broadlink
import sys

To:

import sys
sys.path.insert(0, './')
import broadlink

Retrieve the current usage from the SP3S using “broadlink_cli”:

$ ./broadlink_cli --type 0x947a --host 10.0.0.199 --mac 34ea34797bff --energy
0.0

Turn on the power:

$ ./broadlink_cli --type 0x947a --host 10.0.0.199 --mac 34ea34797bff --turnon
== Turned * ON * ==

Store the power meter data to InfluxDB

The Python script below reads the power consumption from the SP3S every 30 seconds and stores it in InfluxDB.

#!/usr/bin/env python3
import sys
import time
import datetime
from influxdb import InfluxDBClient

sys.path.insert(0, './')
import broadlink

name = '<NAME>' # What is the SP3S connected to?
devtype = 0x947a # Device type of the SP3S, see https://github.com/mjg59/python-broadlink/blob/master/broadlink/__init__.py#L25
host = '10.0.0.199'
mac = bytearray.fromhex('34ea34796e9c') # The MAC address of the SP3S (without colons!)

dev = broadlink.gendevice(devtype, (host, 80), mac)
influx_client = InfluxDBClient('<INFLUXDB_HOSTNAME>', 8086, '<USERNAME>', '<PASSWORD>', '<DATABASE>')

def get_data():
    dev.auth()
    return dev.get_energy()

def influx(value):
    if value is None:
        return
    json_body = [
        {
            "measurement": name,
            "time": datetime.datetime.now(datetime.timezone.utc).replace(microsecond=0).isoformat(),
            "fields": {
                "usage": float(value),
            }
        }
    ]
    try:
        influx_client.write_points(json_body)
    except Exception as err:
        print('Failed to write to influxdb: %s' % str(err))

while True:
    try:
        influx(get_data())
    except Exception as err:
        print('Error: %s' % str(err))
    time.sleep(30)

Graphing the result in Grafana

In Grafana, use this configuration for the graph. Replace <NAME> with the name used in the script.
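To check the stored data outside Grafana, the same measurement can be queried from Python. This is a minimal sketch, assuming the "usage" field written by the script above; the helper names `build_query` and `fetch_usage`, the one-hour window and the 30 second grouping are my own choices for illustration, not the panel configuration itself.

```python
def build_query(measurement, field="usage", window="1h", interval="30s"):
    """Build an InfluxQL query averaging `field` per `interval` over the last `window`."""
    return (
        'SELECT mean("{field}") FROM "{measurement}" '
        "WHERE time > now() - {window} GROUP BY time({interval}) fill(null)"
    ).format(field=field, measurement=measurement, window=window, interval=interval)


def fetch_usage(client, measurement):
    """Run the query with an influxdb.InfluxDBClient and return (time, mean) pairs."""
    result = client.query(build_query(measurement))
    return [(point["time"], point["mean"]) for point in result.get_points()]


if __name__ == "__main__":
    # Print the query so it can be pasted into the influx shell or a Grafana panel.
    print(build_query("<NAME>"))
```

Pass the `influx_client` object from the script above as `client` to get the averaged values as a list.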

Some interesting results

Measuring the power usage of several devices gives interesting insight into what a device is actually doing power-wise. Some examples are below.

The washer consumes around 2200 Watt at the beginning of a ~1:45h, 40°C program, and about 500 Watt at the end to spin the clothes partly dry.
The washer draws 2200 Watt for a bit longer during a ~1:45h, 60°C program.
My washer is actually a wash-dry combination. When starting the dry program after a ~1:45h, 40°C program you see that drying consumes even more energy than washing.
The fridge consumes around 80 Watt about 30% of the time to keep the fridge cool. If you look closely you can actually see 3 mini-spikes in the morning where I opened the fridge and the light turned on.
The electric heat pump starts heating the 150 liter hot water tank at 23:00 and ramps up to 1250 Watt. It starts exactly when electricity switches to the low tariff, smart 🙂 The heat pump also heats the house and tries to keep it at a constant temperature, which is said to be the most power-efficient approach for a well-insulated house. For this it continuously consumes 700 Watt when it gets colder in the house during the night.
When it gets too warm in the house the heat pump also has the ability to cool. This consumes less power than heating.
And sometimes this heating/cooling system is just stupid. During the day it is too warm and the same night it is too cold.
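The fridge numbers above (around 80 Watt, about 30% of the time) can be turned into an energy estimate with a little arithmetic. The snippet below is a rough sketch that assumes the 30% duty cycle holds all day and all year; the function names are my own.

```python
def average_power(watts, duty_cycle):
    """Average power in Watt for a device drawing `watts` during a fraction `duty_cycle` of the time."""
    return watts * duty_cycle


def energy_kwh(avg_watts, hours):
    """Energy in kWh used when drawing `avg_watts` on average for `hours` hours."""
    return avg_watts * hours / 1000.0


fridge_avg = average_power(80, 0.30)      # average draw in Watt
fridge_day = energy_kwh(fridge_avg, 24)   # kWh per day
fridge_year = fridge_day * 365            # kWh per year

print("average: %.0f W, per day: %.2f kWh, per year: %.0f kWh"
      % (fridge_avg, fridge_day, fridge_year))
```

With these assumptions the fridge averages 24 Watt, which is roughly 0.58 kWh per day or about 210 kWh per year.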