Any talk of data science trends typically features a familiar cast of characters. Big data, the cloud, artificial intelligence, machine learning, the internet of things, and “the edge” grace the pages of websites and magazines chronicling the data science revolution. Data science is the future.
Once considered a niche discipline involving mathematics, statistics, and computing, data science is now a core function of business, society, and government. Its move into the mainstream is the past decade’s most significant data science trend. As data science matures, its trends converge and intersect, and that point of intersection is where the Fourth Industrial Revolution begins.
Highlights: Trends in Data Science
Data science is constantly evolving. Here is our current short list of data science trends.
Automated Machine Learning
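Machine learning is a subset of artificial intelligence, and automated machine learning (“AutoML”) applies automation to the machine learning workflow itself. Building machine learning models is a time-consuming, iterative process. AutoML reduces the need for human intervention in its repetitive tasks, such as model selection and hyperparameter tuning, so data scientists can build production-ready ML models at scale.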
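As a rough illustration of the repetitive work that AutoML takes over, the sketch below automates one small piece of it: trying several candidate settings for a model and keeping the one that validates best. The toy data, the nearest-neighbor model, and names such as `auto_select_k` are assumptions made for this example; real AutoML tools search far larger spaces of models and settings.

```python
from collections import Counter

# Toy labelled data (hypothetical): (feature, label) pairs.
DATA = [(1.0, "low"), (1.2, "low"), (1.5, "low"), (1.9, "low"),
        (3.0, "high"), (3.3, "high"), (3.6, "high"), (4.1, "high")]


def knn_predict(train, x, k):
    """Predict a label for x by majority vote among the k nearest training points."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(data, k):
    """Score a candidate k by holding out each point in turn and predicting it."""
    correct = 0
    for i, (x, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        if knn_predict(train, x, k) == label:
            correct += 1
    return correct / len(data)


def auto_select_k(data, candidates):
    """Evaluate every candidate k and return (best_k, scores)."""
    scores = {k: leave_one_out_accuracy(data, k) for k in candidates}
    best_k = max(scores, key=scores.get)
    return best_k, scores


if __name__ == "__main__":
    best_k, scores = auto_select_k(DATA, candidates=[1, 3, 5, 7])
    print("validation scores:", scores)
    print("selected k:", best_k)
```

The loop over candidates is the part a person would otherwise repeat by hand. Here the search space is a single setting with four values, which keeps the idea visible.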
The Internet of Things
Most people know IoT through names like “Alexa” or “Siri.” But the Internet of Things is also hard at work in health care, the smart grid, industrial manufacturing, and autonomous vehicles. Its function as an interface between the physical and digital worlds gives IoT a primary role in a new energy economy and Industry 4.0.
Edge Computing
As the name implies, edge computing decentralizes computing power and brings data processing capacity to the edge, near where it is needed. Edge computing makes the Internet of Things possible by moving the processing power to where the data is generated instead of the other way around. As a result, edge devices rely less on the cloud to perform automated tasks.
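One way to picture processing data where it is generated is a device that condenses its own raw readings and forwards only a compact summary. The sketch below is a minimal illustration under assumed conditions: the temperature readings, the alert threshold, and the function name `summarize_at_edge` are all hypothetical, and no real device or cloud API is involved.

```python
from statistics import mean


def summarize_at_edge(readings, alert_threshold):
    """Reduce raw sensor readings to a small summary on the device itself.

    Only this summary, not every raw reading, would need to leave the device.
    """
    alerts = [r for r in readings if r > alert_threshold]
    return {
        "count": len(readings),
        "mean": round(mean(readings), 2),
        "max": max(readings),
        "alerts": alerts,
    }


if __name__ == "__main__":
    # Hypothetical temperature readings collected locally.
    raw = [21.0, 21.4, 20.9, 35.2, 21.1, 21.3]
    summary = summarize_at_edge(raw, alert_threshold=30.0)
    print(summary)
```

The design choice is the point: the decision about what counts as an alert is made next to the sensor, so the device can act without waiting on a round trip to a remote server.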
Most of us use edge computing in our daily lives without necessarily realizing it. Smartphones, wearables, gaming systems, and other everyday devices operate at the edge. But industries from healthcare to security, energy, manufacturing, and retail are adopting edge computing within their operations.
Beyond the edge, “fog computing” harnesses substantial processing power and data storage in a layer between edge devices and the cloud, without relying on the cloud itself: the edge on steroids.
The Cloud
Perhaps the cloud’s most dominant trend is its expanding ubiquity. Like the use of other technologies already well underway before 2020, cloud service utilization jolted forward with the pandemic. Remote and increasingly flexible work accelerated demand for cloud-based services.
The surge continues: spending on cloud services is projected to reach $482 billion in 2022, up from $396 billion in 2021, an increase of roughly 22 percent.
As the cloud grows, so do concerns about server farm energy efficiency and the cloud’s climate impact. Data centers are already large adopters of renewable energy.
Small Data and Tiny Machine Learning
Centralized computing power, data throughput, and storage enable enormous machine learning models that cover vast knowledge areas. For instance, the deep learning network behind the GPT-3 language model contains roughly 175 billion parameters.
Even for large enterprises, big data involves a significant resource demand that often is not required for the task at hand. Small data and tiny machine learning use energy-efficient microprocessors that require little memory or computing power to accomplish targeted data processing and modeling tasks.
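A back-of-the-envelope calculation gives a sense of the gap in scale between the two approaches. The sketch below uses the 175 billion parameter figure cited above; the storage sizes per parameter (2 bytes and 1 byte) and the 256 KiB microcontroller memory are assumptions chosen for illustration, and the estimate ignores everything other than the parameters themselves.

```python
def parameter_memory_bytes(num_parameters, bytes_per_parameter):
    """Memory needed just to store a model's parameters."""
    return num_parameters * bytes_per_parameter


def max_parameters_that_fit(ram_bytes, bytes_per_parameter):
    """Largest parameter count that fits in the given memory."""
    return ram_bytes // bytes_per_parameter


LARGE_MODEL_PARAMS = 175_000_000_000   # figure cited in the text
BYTES_16_BIT = 2                       # assumption: 16-bit parameters
BYTES_8_BIT = 1                        # assumption: 8-bit parameters
MICROCONTROLLER_RAM = 256 * 1024       # assumption: 256 KiB of memory

if __name__ == "__main__":
    large = parameter_memory_bytes(LARGE_MODEL_PARAMS, BYTES_16_BIT)
    print(f"large model parameters: {large / 1e9:.0f} GB")  # 350 GB

    tiny = max_parameters_that_fit(MICROCONTROLLER_RAM, BYTES_8_BIT)
    print(f"parameters that fit on the microcontroller: {tiny:,}")  # 262,144
```

Under these assumptions the large model needs hundreds of gigabytes for its parameters alone, while the small device has room for a few hundred thousand, which is why tiny machine learning targets narrow, well-defined tasks.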
As with edge computing and IoT, small data and tiny machine learning decentralize the power of data science. Put another way, small data and tiny machine learning democratize and humanize data.
Everyday use cases include:
- The Internet of Things
- Monitoring local environmental conditions
- Drone navigation
- Intelligent motion detection
- Image and object recognition
Generative AI
Of the many directions in which AI is headed, generative AI is arguably one of the most fascinating and unsettling. ITVibes defines generative AI as enabling the “creation of new content including text, images, and audio/video,” adding that “one could be tricked into believing the artificially generated content is real.” Its output is also known as synthetic media, and deepfakes are the most widely known example.
“Reality is becoming mutable,” begins an article in the Independent describing deepfake videos of Tom Cruise on TikTok. Videos like these illustrate the risk of unchecked, unethical applications of data science, notwithstanding the many legitimate uses of generative AI.
Data Science at Merrimack College
At Merrimack College, data scientists, analysts, and engineers have several avenues available to upskill their talent, advance their careers, and help lead in the Fourth Industrial Revolution.
The online Master of Science in Data Science degree program teaches students the practical skills and theoretical understanding they need to succeed amid a data science revolution.
The top-rated, industry-aligned curriculum covers six core data science skill sets and learning objectives:
- Formulating Problems
- Collecting and Processing Data
- Analyzing and Modeling Data
- Presenting and Integrating Results into Action
- Real-World Applications of Data Science
- Capstone
These topics are examined over six required and two elective courses. One elective must be from the Real-World Applications group. Each course is four credits, for a total of 32 credits.
Graduate Certificates in Data Science
The Data Science Certificate introduces students to the discipline of data science by developing foundational skills in extracting decision-guiding insights from data.
- In DSE 5001 Introduction to Data Science and Statistics, students will learn what data science is, what data scientists do, what types of problems data scientists solve, and the fundamental statistical techniques that are used.
- In DSE 5002 R and Python Programming, students will use the coding languages R and Python to collect, explore, clean, wrangle, and summarize large data sets.
- In DSA 5400 Visual Data Exploration, students will use industry-leading software to “tell the story of the data” by creating graphical summaries with Tableau and interactive dashboards with R Shiny.
This certificate is geared toward professionals who are interested in acquiring data literacy and developing the core data analytic skills that are the foundation for careers in data science. It assumes some prior STEM (science, technology, engineering, mathematics) coursework and/or quantitative aptitude.
Data science is an exciting field and a lucrative career. More importantly, it is critical to a flourishing, sustainable future and prosperous economy.