Question
Continuation of last year's project: Machine Intelligent Enterprise Wi-Fi-as-a-Service (WaaS)
Previously, a set of algorithms had been developed to mine historical networking data collected from
Wi-Fi Access Points (AP) and end user client devices. The algorithms were aimed at automatically setting
complex radio parameters/configurations to improve end user Quality of Experience (QoE) in near real
time.
Initial algorithms were based on known physical layer relationships and the configuration experience of
our experts.
During field trials, it was found that the theoretical relationships between these key parameters (RSSI,
SNR, data rates, cell size, etc.) have significant errors and variations due to real-life over-the-air RF
impairments and inaccuracies in the equipment vendors' low-level device driver implementations.
The need to dynamically learn the relationship between the collected parameters and device QoE
led to the investigation of the latest deep learning technologies as applied to Wi-Fi network optimization.
Challenges were present in the following areas and required architectural rethinking:
o the total number of distinct networks that can be managed, and the ability of the algorithms to
work across radio and LAN protocol layers.
o the need to deeply manage several different Wi-Fi technologies (Broadcom, Qualcomm, etc.) using a
generic approach.
o scaling to increase the number of dynamic events that can be handled.
Architecture/Design:
o The WaaS Cloud architecture is composed of: a data collector engine, a big data storage engine,
machine learning servers, and an expert system.
o The expert system uses a set of rules to correlate incoming network events such as device
attachments, roaming, disconnects, etc. The purpose of the rule set is to automatically detect
connectivity anomalies in real time and establish root causes.
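As an illustration only, one rule of the kind described above could be sketched as follows. The event fields, the event type names, the "flapping" rule and its thresholds are all hypothetical; the actual rule set is not described in this report, and this sketch only detects an anomaly without attempting root-cause analysis.

```python
from collections import defaultdict

# Hypothetical event record: (timestamp_seconds, device_id, event_type).
# The rule and its thresholds are illustrative assumptions.

def detect_flapping(events, window_s=60, max_disconnects=3):
    """Flag devices that disconnect more than max_disconnects times
    within any sliding window of window_s seconds."""
    times_by_device = defaultdict(list)
    for ts, device, kind in events:
        if kind == "disconnect":
            times_by_device[device].append(ts)

    anomalies = set()
    for device, times in times_by_device.items():
        times.sort()
        start = 0
        for end in range(len(times)):
            # Advance the window start until the window spans at most window_s.
            while times[end] - times[start] > window_s:
                start += 1
            if end - start + 1 > max_disconnects:
                anomalies.add(device)
                break
    return anomalies
```

A production rule engine would evaluate many such rules incrementally as events arrive rather than over a stored batch.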
During field deployment, it was found that an unexpectedly large number of events were received from
specific customer networks, seriously increasing the processing load on the cloud servers. Investigating
the root cause of this event volume and rethinking the event processing architecture became urgent in
order to contain cloud computing server requirements and avoid prohibitive cloud cost increases.
Machine learning technologies have made significant progress and are no longer reserved for academia
and research; they are now also used in industrial development. A study of mainstream deep learning
algorithms from open-source projects in R, Python and TensorFlow was undertaken. After several
evaluation prototypes, the TensorFlow approach was selected due to its maturity.
As a first significant exercise, all collected network element statistical data had to be normalized for
consumption by the deep learning algorithms. A normalization engine was developed to transform
hundreds of different variables to a common range of -1 to +1.
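The report does not say which normalization method the engine used. A minimal sketch, assuming simple min-max scaling of each variable to the range -1 to +1 (the function names and the variable names in the usage note are hypothetical):

```python
def fit_min_max(columns):
    """Record the observed (min, max) of each variable.

    columns maps a variable name to a list of raw numeric samples."""
    return {name: (min(vals), max(vals)) for name, vals in columns.items()}


def normalize(value, lo, hi):
    """Linearly map value from [lo, hi] to [-1, +1].

    A constant variable (lo == hi) carries no information, so map it to 0."""
    if hi == lo:
        return 0.0
    return 2.0 * (value - lo) / (hi - lo) - 1.0


def normalize_columns(columns, ranges):
    """Apply the per-variable ranges from fit_min_max to every sample."""
    return {
        name: [normalize(v, *ranges[name]) for v in vals]
        for name, vals in columns.items()
    }
```

For example, with made-up samples `{"rssi_dbm": [-90, -60, -30]}`, the fitted range is (-90, -30) and the normalized values are -1, 0 and +1. Min-max scaling is sensitive to outliers, so a real engine might clip or use fixed per-variable bounds instead.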
The normalized dataset was then used to start training the different deep learning algorithms for
evaluation purposes. The training approach was verified using data collected from the internal office
network, which was running the existing cloud algorithms. Once trained, the deep learning approach
produced very similar results, validating the basic process of normalization and training.
The next step was to develop a self-training framework composed of several learning algorithms
feeding a QoE evaluation algorithm. The following combination of approaches was selected:
Long Short-Term Memory (LSTM), Q-Learning, and the Bellman equation.
Testing of the framework resulted in the following:
o Q-Learning predicts which AP configuration changes should optimize device QoE for a
unique device at a given location, time frame and traffic model.
o the Q-Learning predictions are acted on with AP configuration updates, creating a feedback loop
to Q-Learning that over time will self-optimize given complex overlapping constraints.
o the QoE scorer extrapolates the end-user device QoE and feeds the results back to the
framework.
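The feedback loop above can be illustrated with the standard tabular Q-Learning update, which follows from the Bellman equation: Q(s, a) is moved toward r + gamma * max over a' of Q(s', a'). This is a simplified sketch, not the project's implementation, which combined Q-Learning with LSTM networks. The state and action labels, the reward (standing in for the QoE score) and the hyperparameter values are assumptions.

```python
import random
from collections import defaultdict


class QLearner:
    """Tabular Q-Learning with an epsilon-greedy policy."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.actions = list(actions)   # hypothetical AP configuration changes
        self.alpha = alpha             # learning rate
        self.gamma = gamma             # discount factor
        self.epsilon = epsilon         # exploration probability
        self.q = defaultdict(float)    # (state, action) -> estimated value
        self.rng = random.Random(seed)

    def choose(self, state):
        """Explore with probability epsilon, otherwise pick the best-known action."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        """Move Q(state, action) toward the Bellman target."""
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```

In the loop described above, `choose` would select a configuration change, the QoE scorer would supply `reward` after the change is applied, and `update` would close the loop.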
The multi-algorithmic framework was implemented in Python and deployed in the cloud.
After several iterations, the deep learning algorithms achieved convergence towards AP radio
configurations. However, a significant roadblock was encountered in the computing power required to
train these algorithms.
For a small number of APs (~11) and users on the network (~33), it took several hours (5+) of computing
to train the model before it could be used for prediction. Work was started on migrating the CPU-based
virtual machines to GPU-based virtual machines optimized for matrix computation, and will continue in
the next year.
Different QoE prediction algorithms were developed to compete side-by-side with human expert formulas.
The main technologies experimented with were polynomial linear regression, neural networks,
random forest (decision tree), recursive partitioning, and regression trees.
At the end of the year, the results of the QoE predictors were still not consistent with the experience
perceived by humans. The root cause is likely invalid input data in the training data set. Investigation
will continue in the next year.
Client device behavior was found to vary widely across customer segments (Education, Medical, Retail,
etc.). For example, retail deployments use devices such as scanners, which generate a very large number
of Wi-Fi signaling events to the cloud. It was observed that during peak usage the system was
overwhelmed by the volume of events, which caused unacceptable processing delays.
A new event processing architecture was studied and implemented, in which events are preprocessed in
the local AP agent before being sent to the cloud. The preprocessing engine dramatically reduced the
event volume for successful client device transactions while retaining the necessary detailed
information for failures. The new algorithms were released to the field and yielded a significant
reduction in cloud event processing and its associated cost.
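The report does not describe the preprocessing algorithm itself. One plausible approach, sketched here under that caveat, is for the AP agent to collapse successful transactions into per-device counts while forwarding failure events unchanged. The event keys (`device`, `type`, `status`) and the function name are hypothetical.

```python
from collections import Counter


def preprocess_events(events):
    """Summarize successful transactions as counts; keep failures in full.

    events is an iterable of dicts with hypothetical keys
    'device', 'type' and 'status' ('ok' or 'fail')."""
    success_counts = Counter()
    failures = []
    for event in events:
        if event["status"] == "ok":
            success_counts[(event["device"], event["type"])] += 1
        else:
            failures.append(event)  # full detail retained for diagnosis
    summary = [
        {"device": device, "type": kind, "status": "ok", "count": count}
        for (device, kind), count in sorted(success_counts.items())
    ]
    return summary + failures
```

With this scheme, a scanner that roams successfully many times in a reporting interval produces one summary record instead of one event per roam, while a failed attachment still reaches the cloud with every field intact.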
By the end of the year, the project had achieved:
o Implementation (Proof of Concept) of deep learning used to self-learn the relationship between radio
statistics and end user experience optimization. The work will continue next year to optimize the deep
learning processing, with the goal of making live deployment commercially viable.
o Re-architecting of device transaction event processing between the Cloud Expert system and the local AP
agents was completed and released to the field with good initial results.
242 What scientific or technological uncertainties [A1] did you attempt to overcome - uncertainties that could not be removed using standard practice? (Maximum 350 words)
[A1]Guideline for completing Box 242. This section should answer the following 4 questions (usually in the order listed here):
What is the project about? Very briefly describe the project in 1-2 lines. You can also describe project objective here.
What was the biggest challenge/uncertainty?
What was the standard practice before starting this project and WHY won't it work?
Was there any due diligence work performed?
NOTE: The most important aspect here is to clearly define the uncertainty (question #2)
244 What work did you perform in the tax year to overcome the scientific or technological uncertainties [A1] described in Line 242? (Summarize the systematic investigation or search) (Maximum 700 words)
[A1]Guideline for completing Box 244. This section is intended to show the systematic investigation nature of the work. Another way to look at systematic investigation is to think of each activity using the scientific method as follows:
Identify a problem - usually an uncertainty from Box 242
What was the idea to be tested (hypothesis)?
How was the hypothesis tested?
What was learned (results and conclusions)?
246 What scientific or technological advancements [A1] did you achieve as a result of the work described in Line 244? (Maximum 350 words)
[A1]Guideline for completing Box 246. The advancement can and should be detailed in two ways:
What new capability do you have?
What new KNOWLEDGE did you gain in order to achieve that capability?