Data Scientist at AWS Network Monitoring
AWS Network Monitoring is looking for a talented Data Scientist who is passionate about using data to drive business decisions.
You are skeptical. When someone gives you a data source, you pepper them with questions about sampling biases, accuracy, and coverage. When you’re told a model can make assumptions, you aggressively try to break those assumptions.
You have a passion for excellence. The wrong choice of analysis could cost the business dearly. You maintain rigorous statistical standards and take ownership of the outcomes your analyses suggest.
You do whatever it takes to add value. You don’t care whether you’re working on machine learning, blazing fast code, forecasting, distributed computing, scraping data, or building complex data visualizations. You care passionately about shareholders and know that, as a curator of data insight, you can unlock massive cost savings.
You have limitless curiosity. You constantly ask questions and break apart problems scientifically. You form hypotheses and then eagerly try to break them.
You value relationships and patient explanation. Part of being a trusted advisor is building a strong track record and continually demonstrating to customers that your analysis will save them money and time. You break down your model assumptions without jargon and help customers understand where your model fails.
You know and love working with business intelligence tools, are comfortable accessing and working with big data from multiple sources, and passionately partner with customers to help identify strategic opportunities and deliver results.
This role ranges from hands-on data mining of the vast quantities of data generated by the AWS global network to driving and implementing improvements in AWS Network Monitoring tools.
Part of the job will be setting up experiments, which involves finding appropriate data sources, retrieving and cleaning the data, and analyzing and tracking the output. You will also develop supportable conclusions based on your analysis and, where appropriate, contribute to their implementation.
- Collect, analyze, and present actionable data to help drive product and customer experience decisions at a senior level.
- Understand high-level business objectives and continually align work with those objectives to meet the needs of the business.
- Identify, analyze and solve problems at their root while maintaining focus on the bigger picture.
- Recognize and adopt best practices in reporting and analysis: data integrity, test design, analysis, validation, and documentation.
- Write high quality code to retrieve and analyze data.
- Invent and design decision algorithms based on data analysis.
- Learn and understand a broad range of Amazon’s data resources, and know when and how to use each one and when not to.
Senior Research Scientist for Network Anomaly Detection
The Network Alerts team at Amazon is based in Dublin, Ireland. We are part of the AWS Networking organization. Our mission is to process network telemetry messages and interpret them so that we monitor the network effectively. Our goal is to detect impact to customer traffic and fix the root cause within seconds. The network is the largest and fastest-growing network in the world. The customer traffic we are monitoring is your traffic, because thousands of the apps and websites that you use are based on AWS.
Our traditional monitoring services are critical to the smooth running of the network and those services are truly large scale - processing over 30 million observations per second. The services are predominantly written in Java on Linux and they are large - even by Amazon standards. They are distributed over thousands of hosts in hundreds of global locations and operate at higher than "five nines" availability. In 2018 we began to incorporate anomaly detection techniques into our suite. We are using Data Science and Machine Learning (ML) approaches such as Exponential Smoothing, Distribution Modelling, Clustering, and Spatial Cosine Similarity. We have put these techniques into production and we can now detect issues which were previously undetectable - for example by dynamically choosing the right threshold for an alarm covering a million ports, or forecasting the traffic level of an internet exchange, or finding a rare natural language log among a corpus of billions. By the way, we do all of this on live time series data.
With the success of anomaly detection in 2018, we are doubling down. In 2019 we will finish the implementation of six separate anomaly detector services and plug them into our "fire hose" of metric observations. We will build a supervised machine learning system that will ingest an expected million anomalies per minute and make sense of them for operators. We will use statistical techniques to learn associations between anomalies, alerts, and external factors. These associations will become rules in an expert system which we will build, and it will increasingly assist humans in making associations and decisions on the relationship between alerts and anomalies. We will apply unsupervised machine learning algorithms to cluster this data into incidents. Those incidents will then largely be managed by our autonomous response system. Where necessary, a small number will be escalated to humans, and the system will continue to learn from their actions, which label the data so it can be modeled better.
We are looking for a Senior Research Scientist to join us in our brand new office in Dublin, Ireland. You will join a tenured team of Research Scientists and Software Development Engineers and take the lead on a number of new initiatives.