
WLAN deMISTified

Maciej Zurawski
July 2019

The main purpose of the Mist acquisition was to obtain its Cloud and AI solution, which Juniper can later apply to network orchestration as well. Wireless is just an additional flavor, and it looks like this one can be really tasty.

Introduction

Recently, I had the pleasure of testing Arista’s Cloud-based WLAN solution. To avoid being biased by the “I know one vendor” rule, I eagerly jumped on the next assignment: checking out what is up with Mist, the latest addition to the Juniper Networks portfolio. Mist Systems is the name of the company recently acquired by Juniper Networks; it advertises itself as a provider of an AI-driven, Cloud-based Wi-Fi solution with a lot of effort put into BLE-based location services. Small tip: if you are among German-speaking people, please do not start the conversation with that company name, trust me on that. ;) Since this is starting to look like a series of articles, I will follow the previous format and go through all the features I was able to try and test. Oh, by the way, we were working on version 0.6.17990.

Similarly to Arista (Mojo), the Mist servers hosting the whole Cloud service are deployed in AWS across multiple VPCs and security groups. The servers are protected by firewalls, and only the required ports are opened on the front-end servers or terminators that need to communicate directly with APs or APIs from outside. Unlike Arista, Mist does not provide any management option other than this Cloud subscription model. If you are waiting for a virtual controller or an on-premise appliance, then I have bad news: there are currently no plans for that. There is Mist Edge, but its purpose is to bring some of the Cloud microservices closer to the campus edge, so that some network functions run closer to the end users, lowering latency and addressing some architectural demands. This sounds a little bit like a controller, but it is not. You still need connectivity to the Mist Cloud to manage your network. To operate, Mist Edge uses L2TPv3 to tunnel traffic between the WLANs and the edge elements. We did not get the chance to test it, so I can only base my evaluation on the datasheet and specification. Some example use cases:

  • Split tunneling for guest access and corporate traffic
  • Seamless roaming for large campus networks (via localized tunneling)
  • Extending VLANs to distributed branches and telecommuters to replace remote VPN technology
  • Wi-Fi backhaul for IoT devices

Mist Cloud is built on a microservices architecture and some of the services are:

  • Marvis – AI virtual network assistant
  • RRM – allows viewing channel and transmit power across all APs on the network
  • Anomaly detection – sets dynamic thresholds on the network SLEs
  • Location engine – leverages BLE to triangulate the location of a person/resource
  • AP terminator – communicates with the APs

Testing Inventory

AP41 Access point

In our test, we wanted an AP model with the same or comparable performance to the C-130 we tested for Arista, so we went with three AP41 units. The AP41 is a 4×4:4 MU-MIMO/SU-MIMO tri-radio access point. As with Arista, the third radio is used for dual-band security scanning. The AP41 supports data rates of up to 1.733 Gbps on 5 GHz and 800 Mbps on 2.4 GHz. It also comes with a BLE antenna array consisting of 16 antennas, used for location services. Other characteristics include two Gigabit Ethernet ports, which can be used as a port bundle or to extend an SSID over the wired network. Additionally, the ports can be used for PoE pass-through, which lets the device hand some of its unused allocated power to a daisy-chained device. A good example is the AP41 itself, which requires 802.3at PoE+ (25.5 W) but can request the full 30 W and then power a connected BT11, which requires only 6 W (since it only contains a BLE antenna array). It is worth mentioning that the AP41 will boot on 802.3af power but will be unstable. At the time of writing, this is the most powerful AP in the fleet and can be called the Mist flagship.

BLE 16 antenna array

For BLE location services, APs should be mounted less than 5 meters above the ground, horizontally and facing downwards.

The Mist APs support both options for power negotiation in LLDP-MED:

  • MDI TLV IEEE 802.3-2015
  • MDI TLV IEEE 802.1AB-2009

The preferred option is, of course, the newer one which contains more information.

Getting started

Initial Setup

In order to start playing around with the Mist products, we need three key components at the beginning of our journey:

  • organization and site configured in the Cloud (and of course our own, personal account)
  • AP connected, powered up and reachable from the Cloud
  • AP claimed by the proper organization

Organization

Since APs are claimed by an organization, it is good to have at least one before starting that process. The organization is a logical grouping of assets, both physical (APs, sites) and logical (networks, rules), and it ties these assets together via rules and policies. In the simplest case, an organization is an entire company, but it is not limited to that; it is just a question of how far you want to go with overcomplicating things. The organization is created at the initial login to a newly created account, or it can be created directly from the account profile page (“Utilities” drop-down menu) if you want to manage many of them. As far as I can see, there is no limit on how many organizations one person can create.

The organization configuration is not very extensive; it just consists of:

  • Password and session policies – govern access to the organization, 2FA is supported
  • Management connection – how APs will connect to the Cloud (in most cases the DHCP option is enough)
  • Support access – opting in or out of Mist support access to your organization
  • Mist and RadSec certificates – a place where the RadSec certificate can be uploaded and the Mist certificate (used by the APs) retrieved
  • Single sign-on – requires adding an IDP, Google is supported without any additional configuration
  • Webhooks – for the streaming of organization-wide alerts (but I was not able to trigger any to verify if it works…)

Small remark: when supporting multiple organizations, Mist partners may request MSP (Managed Service Provider) level access, which allows an overview of and access to multiple organizations within a single MSP dashboard. The one obvious limitation is that a given organization can belong to only one MSP.

The site is the second required entity for the basic WLAN setup. All APs claimed by the organization have to be placed in one of the configured sites, so that they can inherit common settings and firmware. A site requires some basic information before it can be created, namely the timezone and location, which are used to apply country-specific RF constraints. At the site level, we assign RF templates, which specify the RF properties of the access points residing on a particular site, and configure BLE options. As an organization grows and gains more sites, the concept of groups can be useful. Groups are based on our own internal matching criteria and can be viewed as labels for sites. Because of that, a site can exist in multiple groups at the same time.

Claiming APs

An AP can be claimed by the Organization using one of three methods:

  • manually adding each AP in the portal
  • scanning the QR code using the MistAI mobile application
  • using the API

Since I had only three APs to play with, had not set up an API environment at that point, and have an inherent distrust of mobile applications, I chose manual assignment. It was rather straightforward and gave me a name generator, so I could start using a naming convention from “day one”.

There is a fourth option I did not mention at the beginning, since I had no possibility to check it out: when an order is placed, Mist should send an email containing an Activation Code. Entering the activation code on the subscription page claims all the APs onto your organization and puts them in an “Unassigned” state.

Setting up APs

Similarly to Arista, connecting APs to the network only requires plugging them into a suitable port that provides the required power level (PoE+) and network connectivity. Mist APs will not operate gracefully at lower power levels, as they do not disable any features to adapt to worse conditions. So it is important to ensure full power; otherwise, device performance can suffer. APs assume that they are connected to an untagged port (or a tagged port with a native VLAN configured), start to broadcast DHCP discovery messages to acquire an IP address, and then try to connect directly to the AP terminator in the Cloud. When connecting APs to the network, we can encounter some issues. APs signal the problems they encounter with a certain LED blink pattern. To access the info page with the patterns explained, you can click the AP status icon, but only when it is not “Connected”.

We can quickly check whether the Cloud is accessible from our network, for example to verify that no firewall is blocking the traffic, by trying to access the AP terminator (the service used for connectivity between the APs and the Cloud).

$ curl ep-terminator.mistsys.net/about
{
   "version": "'0.3.4104'",
   "git-commit": "'7e6162cbba24ad768037d34e044cc31057ffb01d'",
   "build-time": "'2019-08-06_22:00:31_UTC'",
   "go-runtime": "go1.12.7",
   "env": "production",
   "procname": "ep-terminator/🌧/env=production/host=ep-terminator-172-31-0-43-8ac18a16-production.mistsys.net/pid=5681/user=terminator",
   "start-time": "2019-08-07T22:36:35Z",
   "uptime": 52545.039132
}
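The same check can be scripted, for example as part of a pre-deployment checklist. Below is a minimal Python sketch that fetches the /about document shown above and strips the stray single quotes that wrap some of its values. The function names and the 5-second timeout are illustrative choices, not anything Mist prescribes, and plain HTTP is assumed because the curl call above does not specify a scheme.

```python
import http.client
import json
import urllib.request

# Same endpoint as in the curl call; the http:// scheme is an assumption.
ABOUT_URL = "http://ep-terminator.mistsys.net/about"


def parse_about(body: str) -> dict:
    """Parse the /about JSON and strip single quotes wrapped around string values."""
    raw = json.loads(body)
    return {k: v.strip("'") if isinstance(v, str) else v for k, v in raw.items()}


def terminator_reachable(url: str = ABOUT_URL, timeout: float = 5.0) -> bool:
    """Return True if the AP terminator answers with a parsable /about document."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            info = parse_about(resp.read().decode("utf-8"))
    except (OSError, ValueError, http.client.HTTPException):
        # DNS failure, blocked port, timeout or a non-JSON reply all count as "not reachable".
        return False
    return "version" in info
```

Calling terminator_reachable() from a host in the AP VLAN gives a simple yes/no answer; a False result only says that something on the path failed, not what.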

The same LED that we look at during connectivity problems can also be used to physically locate an AP: when the “locate” feature is activated, it starts to blink purple. Of course, all the legwork needs to be done manually; there is no way to automate that or have the AP do it for you. ;) You need to get up and find that blinking box.

After everything is set up correctly, we should see all green in the “AP inventory”, and then we are ready to move on.

One last thing in this section. If we feel alone managing our own organization: with Mist we do not create accounts for others, we “invite” them. ;) As far as I can see, there are four permission levels you can assign to other admins:

  • Super-user – can do anything; the user who creates the organization is granted this permission level by default
  • Network Admin – has full access to the selected sites, but cannot configure organization-wide properties (excl. creating sites)
  • Observer – has monitor-only access to the selected sites
  • Helpdesk – has limited capability for troubleshooting, but cannot reconfigure any site or organization-wide properties

If an IDP is used for SSO, it is important that the roles supplied by the provider are mapped to the internal access roles on the Mist portal.

Setting up WLAN

Wireless networks can be configured from templates and then inherited (by the organization, specific sites or site groups), or manually on a site-by-site basis. The two approaches do not really differ in configuration options, as both provide the same capabilities. It is just easier to use templates when you need to create multiple sites with the same wireless configuration, since it can be inherited or used by API scripts. Configuring an SSID is rather simple, as it consists of setting the radio band, security options and rate limiting for all users, but there are some interesting options I would like to mention. The whole configuration can be assigned to specific APs only, via the label system.

Labels

Labels are assigned automatically to all clients that connect to the configured WLAN. They can be used later on in security policies.

Disable Static IP devices

The No Static IP Devices option blocks clients that do not initially send a DHCP request.

Data Rates

There are four options in the data rate section:

  • Compatible (allow all connections)
  • No Legacy (2.4G, no 11b)
  • High Density (disable all lower rates)
  • Custom Rates

Opinions are divided on whether low data rates should be disabled or left enabled for backward compatibility. Disabling the lower legacy data rates can improve the performance of the WLAN. In the 2.4 GHz band, the goal is to get rid of 802.11b clients, since they force the use of non-OFDM data rates, which implies additional protection mechanisms (RTS/CTS, CTS-to-self, etc.) that impact overall performance. Most 5 GHz-capable client drivers have been written to accept any rate or set of data rates that an AP announces as Basic. That said, there are still a few 5 GHz-capable client drivers out there that will only accept 6, 12, or 24 Mbps and nothing else. The Custom Rates option makes it possible to disable specific 5 GHz rates. The High Density option will block legacy 2.4 GHz clients and, additionally, all clients whose signal strength is weak. These options should be used with caution: if some legacy devices are expected in the network, they might be blocked from connecting to the BSS.
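To make the difference between the profiles more tangible, here is a small Python sketch of how the legacy rate set on a 2.4 GHz WLAN shrinks with each option. The 802.11b (DSSS/CCK) and OFDM rate lists are the standard ones; the profile names simply mirror the UI options, and the 12 Mbps cut-off for “High Density” is a hypothetical value, since the rates Mist actually disables are not covered here.

```python
# Standard legacy rate sets in Mbps.
DSSS_CCK_RATES = [1, 2, 5.5, 11]              # 802.11b, non-OFDM
OFDM_RATES = [6, 9, 12, 18, 24, 36, 48, 54]   # 802.11g/a, OFDM


def allowed_rates(profile: str, min_rate: float = 12) -> list:
    """Return the legacy rates a 2.4 GHz WLAN would still advertise.

    profile  -- "compatible", "no-legacy" or "high-density" (named after the UI options)
    min_rate -- hypothetical cut-off for "high-density"; not a documented Mist value
    """
    if profile == "compatible":
        return DSSS_CCK_RATES + OFDM_RATES
    if profile == "no-legacy":
        return list(OFDM_RATES)  # drops the 802.11b rates only
    if profile == "high-density":
        return [r for r in OFDM_RATES if r >= min_rate]
    raise ValueError(f"unknown profile: {profile}")
```

A client that supports only 802.11b finds no usable rate in the “no-legacy” list, which is exactly why such devices can no longer join the BSS.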

WLAN Rate Limit

This gives admins the possibility to limit upload/download throughput for the whole network, on a per-client basis, or for a particular predefined application (as far as I was able to see, admins cannot define their own applications).

SSID Scheduling

This option works similarly to the one we already described in the Arista article, but there is one important caveat: currently, setting an SSID schedule restarts the AP radios.

Custom Forwarding

The default behavior of an AP is to send traffic as tagged or untagged (depending on the configured VLAN option) through the primary Ethernet port Eth0, but it is possible to specify an outgoing interface on a per-network basis. Traffic can also be sent via an L2TPv3 tunnel.

Isolation & Filtering

This section gives a little bit of control over how “east-west” traffic is treated. We can completely prohibit hosts on the same WLAN from communicating with each other. Broadcast and multicast traffic can be blocked completely as well, with some exceptions configurable so that admins won’t break the network completely in the process. These exceptions are:

  • do not block ARP (can be configured separately and will influence proxy ARP behavior of AP)
  • do not block DHCP
  • do not block ICMPv6 (ND)

When multicast filtering is enabled, an additional option is revealed to allow mDNS packets, for the Bonjour setups.

Security

Mist offers the common authentication options for the WLAN network:

  • WPA-2/PSK (single or multiple passphrases)
  • WPA-2/EAP
  • Open
  • MAC authentication (works similarly to switches: the device MAC address is sent as both user name and password to the RADIUS server)

When I was looking through the documentation I came across an interesting sentence:

Mist does not expose all functionality to all users. Newer features might be exposed to “beta” users who have asked for access or need access. 
There might be firmware dependencies. Security items such as WEP and TKIP have been superseded by more secure protocols and will only be exposed if a customer has a specific requirement.

An interesting option is WPA-2/PSK with multiple passphrases, which introduces the concept of Personal WLANs. Personal WLANs segment a single WLAN into smaller networks, each with a unique access key (PSK), within a single SSID. These WLAN segments are isolated from other Personal WLANs on the same network. Passphrases can be configured for multiple devices or for specific devices (identified by MAC address).

In Open mode, both external and internal captive portals are supported. Internal ones can be created directly from the Mist management portal; it is rather simple, even simpler than what Arista provides, probably because it is less complex, but still usable. There is no customer engagement, but more authentication options are available, probably as long as you are not concerned about GDPR or IDP policies.

To be honest, I had great trouble making it work properly. I am not sure what the problem was, since it started to work on its own after some time (maybe it takes a while to synchronize the AP, or I did not click “save” somewhere). My symptoms were clients loading the captive portal and being instantly authenticated without any action. They had full access, but the portal claimed that they were not authenticated at all. ;) The only thing that I changed was adding the option to isolate clients at the network level; maybe that did the trick.

Policies and security

Policies are a rather simple concept, but it took me some time before I finally grasped it, since I was so focused on the middle part of the policies. I was surprised that only the general rule (to all resources) can be a block rule. As soon as you add any resource label, the rule action changes to allow. At the beginning I thought it was some sort of bug and wanted to file a support case… Good thing I didn’t: after fiddling with the label configuration for some minutes, I started to see how the pieces fit together. Still, I think it is rather counter-intuitive for people used to firewall security policies. Only the first rule that matches the user labels is applied (all labels on the user side must match). That is why only the general rule can have a block action; it is the equivalent of an implicit accept/deny. User labels can be assigned based on various criteria, such as:

  • AAA Attribute
  • WiFi Client (by MAC or Name)
  • WLAN
  • Access Point

Since only one rule has to match all the user traffic, allow/block actions are taken on a per-resource-label basis… The resource labels were another “Hmm” moment for me. A resource label consists of a name, a type, and a value. Furthermore, the value can be split into a quantifier (IS/NOT; most label types offer only the IS option) and the value based on the type, so in the end it would be e.g. “IS port number 80”. Predefined Application labels have only the deny action. Admins can create custom application labels, but these are limited to the “NOT” matching criterion and still need to be selected from a Mist-supplied list. Taking all that into account, I do not fully grasp the usability of such labels and how they are supposed to work: “When a rule is matched, do not block application X”? This certainly requires more testing in the future, but I think that in 99% of use cases the predefined labels are enough to set up policies that block time- and resource-wasting applications.
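The matching logic described above can be sketched in a few lines of Python. This is only one reading of the behavior observed in the portal, not Mist’s implementation: the Rule class, the decide function, the label names and the “allow when nothing matches” fallback are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    user_labels: frozenset                          # every label here must be on the client
    resources: dict = field(default_factory=dict)   # resource label -> "allow" or "block"
    default: str = "allow"                          # action for resources not listed above


def decide(rules: list, client_labels: set, resource: str) -> str:
    """Apply the first rule whose user labels all match, then look up the resource."""
    for rule in rules:
        if rule.user_labels <= client_labels:
            return rule.resources.get(resource, rule.default)
    return "allow"  # assumption: traffic matching no rule at all is left alone


# Hypothetical policy: guests lose video streaming, everyone else hits the general block rule.
rules = [
    Rule(frozenset({"guest-wlan"}), {"video-streaming": "block"}),
    Rule(frozenset(), default="block"),  # general rule: matches every client
]
```

Because evaluation stops at the first matching rule, the order of the list matters in the same way the order of rules matters in the portal.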

Since WLAN security is a hot topic right now, every vendor is trying to incorporate new security features that emerge on the market. To cope with that, Mist has, similarly to other Cloud-based WLAN providers, installed an additional radio dedicated to security scanning. This radio is used for detecting Rogue, Neighbor, and Honeypot APs on both 2.4GHz and 5GHz radio bands. Scanning should identify:

  • Honeypot APs – unauthorized APs advertising our configured SSID; detection is enabled by default
  • Neighbor APs – unknown APs in the proximity of our network (within the RSSI threshold, default is -80 dBm)
  • Rogue APs – unauthorized APs (not claimed by the organization) that are connected to the same local area network; detection is not enabled by default

In addition, clients are classified based on the APs to which they connect. All clients connected to Rogue APs are Rogue Clients (Watched Clients). To prevent sanctioned APs from being identified as Rogue or Honeypot, specific SSIDs/BSSIDs can be whitelisted in the security configuration.
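The three categories and the whitelist can be expressed as a small decision function. The Python sketch below follows the definitions given above; the SeenAP fields, the function name and the order of the checks (Rogue before Honeypot before Neighbor) are assumptions for illustration, since the text does not say how overlapping cases are resolved.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SeenAP:
    ssid: str
    bssid: str
    rssi_dbm: int
    on_wired_lan: bool   # detected on the same local area network as our APs
    claimed: bool        # claimed by our organization


def classify(ap: SeenAP, our_ssids: set, allowlist: frozenset = frozenset(),
             rssi_threshold_dbm: int = -80) -> Optional[str]:
    """Return "rogue", "honeypot", "neighbor", or None for APs that need no attention."""
    if ap.claimed or ap.ssid in allowlist or ap.bssid in allowlist:
        return None                      # our own or explicitly sanctioned AP
    if ap.on_wired_lan:
        return "rogue"                   # unauthorized AP plugged into our LAN
    if ap.ssid in our_ssids:
        return "honeypot"                # unauthorized AP advertising our SSID
    if ap.rssi_dbm >= rssi_threshold_dbm:
        return "neighbor"                # unknown AP heard above the RSSI threshold
    return None                          # too weak to be reported
```

With the default threshold of -80 dBm, an unknown AP heard at -90 dBm is simply ignored, while the same AP at -70 dBm shows up as a Neighbor.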

Webhooks

Webhooks allow admins to collect data in real time by means of events that push information to a provided URL. Triggered webhook events automatically send the relevant details so they can be stored for data analysis. Since Mist does not support SNMP on its APs, webhooks can be used as part of a monitoring system and can replace an SNMP-trap-based setup, since audit and device events are implemented. Webhooks can be configured at both the site and the organization level, and depending on the level, they handle different events. All webhooks are available for configuration at the site level, but currently the UI (portal) only displays location-related webhooks. The API does not have the same configuration constraints as the UI with regard to webhooks. In fact, you can configure all available webhook options using the API, including audit and device events.

Site-level webhook events

Organization level webhook events

{
    "topic": "location",
    "events": [
        {
            "mac": "4c57ca9553c6",
            "map_id": "3c0eea4c-d7fe-4250-bc86-13b72294aae1",
            "rssi": -56,
            "site_id": "7d17bce9-9eac-416c-9fe4-55ea3c6db410",
            "timestamp": 1565609329,
            "type": "wifi",
            "x": 5.706772,
            "y": 20.428185
        }
    ]
}
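On the receiving side, a payload like the one above is easy to unpack. The following Python sketch parses the body of a “location” webhook into a list of simplified records; the function name and the output field names are made up for illustration, and the timestamp is assumed to be Unix epoch seconds, which fits the value in the sample.

```python
import json
from datetime import datetime, timezone


def parse_location_events(body: str) -> list:
    """Turn the JSON body of a "location" webhook into a list of simple dicts."""
    payload = json.loads(body)
    if payload.get("topic") != "location":
        return []  # other topics (audit, device events) are ignored in this sketch
    events = []
    for ev in payload.get("events", []):
        events.append({
            "mac": ev["mac"],
            "kind": ev["type"],
            "map_id": ev["map_id"],
            "position": (ev["x"], ev["y"]),
            "rssi": ev.get("rssi"),
            # Assumption: "timestamp" is seconds since the Unix epoch.
            "seen_at": datetime.fromtimestamp(ev["timestamp"], tz=timezone.utc),
        })
    return events
```

In a real receiver this function would sit behind whatever HTTP endpoint is registered as the webhook URL; that part is left out here on purpose.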

BLE Location services

Mist uses its patented BLE array (vBLE), which consists of 16 antennas: 8 are reflectors (which help direct BLE energy outward, away from the AP) and 8 are directional antennas that point in eight different directions and transmit BLE energy (beams). The Mist SDK on the device hears the beacons from these beams and sends the RSSI, as well as the device sensor information, back to the Mist Cloud. Mist then builds the location from a “probability surface”, so the more information is provided (more beams from APs), the more accurate the location is. There is also a machine learning engine constantly running on the input data, so it can evaluate the Path Loss Formula (PLF) per device type. Depending on the configured services, the vBLE array can be in transmitting mode (vBLE engagement), listening mode (Asset Visibility), or both if both options are selected. For an AP to start transmitting beams, it needs to be placed on the floor plan. Mist is not limited to its own devices: third-party BLE devices are supported, but their health cannot be monitored as it can for the APs. Since I could not arrange a device with an application using the Mist SDK, I was not able to test the BLE location-based services fully :( and the topic is too extensive to cover with just the Mist Experience App, since that would only be scratching the surface.
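Mist’s actual Path Loss Formula is learned per device type and is not described in detail here, but the general idea of turning an RSSI reading into a distance can be illustrated with the textbook log-distance path loss model. In the Python sketch below, the reference RSSI at 1 m (-59 dBm) and the path loss exponent (2.0, free space) are assumed example values, not Mist parameters.

```python
def estimate_distance_m(rssi_dbm: float, rssi_at_1m_dbm: float = -59.0,
                        exponent: float = 2.0) -> float:
    """Estimate distance from RSSI using the log-distance path loss model.

    d = 10 ** ((RSSI_1m - RSSI) / (10 * n))

    rssi_at_1m_dbm and exponent are illustrative defaults; in practice both
    depend on the transmitter, the receiving device and the environment.
    """
    return 10 ** ((rssi_at_1m_dbm - rssi_dbm) / (10 * exponent))
```

With these defaults, every additional 20 dB of loss multiplies the estimated distance by ten, which shows why a per-device correction of even a few dB makes a visible difference in location accuracy.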

Maintenance

In normal business operations, if using the UI, the first thing that greets each admin is a dashboard, which gives a quick recap of the health state of the selected site. It includes information on the APs assigned to the location with all associated clients, as well as aggregated pre- and post-connection statistics (like DHCP/DNS latency and TX/RX data rates). All information needed for troubleshooting can be accessed directly from here; we can go straight to the client or AP insights and see the latest events. If we want, we can switch to the wireless network view, which will instead show us graphs based on the SLEs (look at these as better SLAs ;)). There are various SLE metrics tracked:

  • Time to connect: tracks the number of connections that took longer than the specified threshold to connect to the internet, measured from the start of association to the point where the client is able to send and receive data.
  • Throughput: tracks the amount of time that a client’s estimated throughput (probabilistic, based on the wireless conditions) is below the threshold.
  • Roaming: tracks the number of successful roams between two APs.
  • Coverage: tracks the amount of time that a client’s RSSI, as measured by the AP, is below the threshold.
  • Capacity: tracks the amount of time that clients experience bad capacity (interference or load).
  • Successful connects: tracks the percentage of successful connections by clients to the network.
  • AP uptime: tracks the number of events causing an AP to lose Cloud connectivity (reboots, WAN issues, etc.).

Each of the SLEs can be accessed further to get more detailed information on particular classifiers, including affected items and timelines.
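To show what a threshold-based SLE boils down to, here is a minimal Python sketch for the “Time to connect” metric: given a list of measured connection times and a threshold, it returns the percentage of connections that met the expectation. The function name and the convention of reporting 100% when there were no connections are assumptions; how Mist aggregates and displays the value is not covered here.

```python
def time_to_connect_sle(connect_times_s: list, threshold_s: float) -> float:
    """Percentage of connections that completed within the threshold."""
    if not connect_times_s:
        return 100.0  # assumption: no connections means nothing missed the expectation
    met = sum(1 for t in connect_times_s if t <= threshold_s)
    return 100.0 * met / len(connect_times_s)
```

The connections that fall on the wrong side of the threshold are the ones worth drilling into via the classifiers mentioned above.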

So the look and feel is very similar to Cognitive WiFi from Arista. We get a ton of information already processed for us, and it is up to us to use it properly, draw accurate conclusions, fix the problem and be admired in our own organization. :D Some additional information: packet captures are taken automatically for bad events when possible. Since most of the issues I faced were related to authentication, I did not actually get any, but an example of such an event could be a DHCP issue, where the full DORA exchange would be captured. It is also possible to grab packet captures on demand if troubleshooting requires it. Unlike Arista, there is no embedded function to parse and analyze captures, so for that we still have to rely on our beloved ♥ Wireshark.

Marvis

Marvis is the virtual assistant and, according to Mist, is complementary to their SLE framework. This AI-driven (buzzword detected ;)) assistant should accept natural language or guided queries to provide insight into the network status. One of the points of natural language is that you do not have to memorize all of the show/select commands, but can instead ask Marvis normal questions and get the output you expect. Since I did not have the chance to do extensive troubleshooting, I was just playing around with a few queries, but after a while I ended up checking what was in the documentation. After a few days you will most likely grasp the concept, but for me it felt a bit off. Maybe Marvis is more human than I am ;) or maybe native English speakers would find it easier to work with a soulless assistant which we cannot really blame for anything. Below are a few examples of Marvis queries:

Admin notifications

Admins can have email alerts configured on a per-site basis. Currently, there are seven types of events which will generate predefined email alerts:

  • DNS Server xx.xx.xx.xx Failure Detected
  • DNS Server xx.xx.xx.xx Recovered
  • DHCP Server xx.xx.xx.xx Failure Detected
  • DHCP Server xx.xx.xx.xx Recovered
  • Device Restart Events Occurred
  • Device Cloud Connectivity Lost
  • Device Cloud Connectivity Recovered

Examples of alert emails:

Final thoughts

Throughout the whole testing phase, I used the Mist documentation very extensively, and I must say it is of a really high standard compared to what I faced during the Arista (Mojo) testing. But I must admit that the problem with Arista was the ongoing migration from one portal to another and only that, not the actual content quality. I can advise everyone starting with Mist to at least finish the internal course on the portal to establish a beachhead in the Mist technologies, since it takes only around 13 hours to finish all the content.

Just to make this summary short and not extend this lengthy article, I will use simple bullet points.

What I did not like:

  • No on-premise option (and no plans for one) and Cloud lock-in
  • Marvis is not as user-friendly as Mist wants us to believe, or maybe it’s just me…

What I liked:

  • Very easy and mostly intuitive to configure
  • The API has the same functionality as the Cloud portal
  • State of the documentation
  • Dedicated scanning radio
  • Location services seem like an interesting feature

A short disclaimer at the end: you should regard this whole article as an “average Joe” experience of configuring an enterprise-grade wireless solution. ;)

Abbreviations

2FA – Two-Factor Authentication
AAA – Authentication, Authorization, and Accounting
API – Application Programming Interface
AWS – Amazon Web Services
BLE – Bluetooth Low Energy
BSSID – Basic Service Set Identifier
CTS – Clear to Send
GDPR – General Data Protection Regulation
IDP – Identity Provider
MSP – Managed Service Provider
OFDM – Orthogonal Frequency-Division Multiplexing
RF – Radio Frequency
RRM – Radio Resource Management
RSSI – Received Signal Strength Indicator
RTS – Request to Send
SNMP – Simple Network Management Protocol
SLE – Service Level Expectation
SSID – Service Set Identifier
SSO – Single Sign-On