A data-driven exploration of the 2023 job market, uncovering where the highest salaries live, which skills employers actually want, and what an aspiring data analyst should learn first to maximize both opportunity and earning potential.
This project is an end-to-end SQL analysis of the data analyst job market, designed to answer one core question: what skills should an aspiring data analyst prioritize to maximize both employability and earning potential? Using a real-world dataset of job postings, I wrote five structured SQL queries, progressing from a basic `SELECT` to multi-CTE joins, to systematically uncover what top-paying companies want, which skills dominate job listings, and where the sweet spot between demand and salary lies. The project demonstrates practical SQL fluency across filtering, aggregation, CTEs, and multi-table joins.
The salary spread is wide. The top 10 run from $185K to $650K, a $465K gap within the same job title, which suggests that compensation is driven heavily by company type and industry, not just by the role itself. Mantys ($650K) and Meta ($336K) sit far above the rest as clear outliers. Strip those two out and the realistic top tier runs from about $185K to $255K, which is still exceptional. Tech and finance dominate the list, with AT&T, Meta, Pinterest, SmartAsset, and Patterned Learning AI making up most of it. Every role here is full-time. Every role is also remote, but that follows from the query's `job_work_from_home = TRUE` filter rather than being a finding about the market. Postings appeared throughout 2023 with no strong seasonal cluster, which suggests the top-paying market stays relatively active year-round.
SELECT
    job_id,
    job_title_short AS job_title,
    company.name AS company_name,
    job_location,
    job_schedule_type,
    job_posted_date,
    ROUND(salary_year_avg, 0) AS salary,
    CASE
        WHEN job_work_from_home = TRUE THEN 'remote'
        ELSE 'on_site'
    END AS job_type
FROM job_postings_fact AS job_postings
LEFT JOIN company_dim AS company
    ON job_postings.company_id = company.company_id
WHERE job_title_short LIKE '%Data Analyst%'
    AND job_work_from_home = TRUE
    AND salary_year_avg IS NOT NULL
ORDER BY salary DESC
LIMIT 10;
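The effect of the two outliers on the spread is simple arithmetic. A minimal sketch in Python, using only the salary figures quoted in the summary of this query, not the full ten-row result; the variable names are hypothetical.

```python
# Salary figures quoted in the summary of this query (USD).
# This is not the full ten-row result, only the values the text mentions.
outliers = {"Mantys": 650_000, "Meta": 336_000}
top_without_outliers = 255_000   # highest salary once the two outliers are removed
lowest_in_top_10 = 185_000       # bottom of the quoted top-10 range

full_spread = max(outliers.values()) - lowest_in_top_10
trimmed_spread = top_without_outliers - lowest_in_top_10

print(f"Spread with outliers:    ${full_spread:,}")
print(f"Spread without outliers: ${trimmed_spread:,}")
```

Removing just two postings shrinks the gap considerably, which is why the trimmed range is the more realistic benchmark.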
Roles in the $185K to $255K tier consistently require SQL, Python, and Tableau as the foundation, with cloud skills such as AWS, Azure, and Databricks separating the highest earners. AT&T's $255K role listed 13 skills, including PySpark and Pandas, which suggests that the top of the market rewards analysts who can work across the data engineering stack, not just analysis.
WITH top_paying_jobs AS (
    SELECT
        job_id,
        job_title_short AS job_title,
        salary_year_avg AS salary,
        job_location,
        job_posted_date,
        company.name AS company_name
    FROM job_postings_fact
    LEFT JOIN company_dim AS company
        ON job_postings_fact.company_id = company.company_id
    WHERE job_work_from_home = TRUE
        AND job_title_short = 'Data Analyst'
        AND salary_year_avg IS NOT NULL
    ORDER BY salary_year_avg DESC
    LIMIT 10
)
SELECT
    tpj.*,
    s_skill.skills AS skill_name
FROM top_paying_jobs AS tpj
LEFT JOIN skills_job_dim AS j_skill
    ON tpj.job_id = j_skill.job_id
LEFT JOIN skills_dim AS s_skill
    ON j_skill.skill_id = s_skill.skill_id
ORDER BY tpj.salary DESC, s_skill.skills;
SQL leads with 92,628 job postings, about 38% more than the second most in-demand skill, Excel (67,031). The top five, in order, are SQL, Excel, Python, Tableau, and Power BI. This confirms the classic analyst stack and shows that foundational skills still drive hiring volume, even if niche skills command higher salaries.
WITH skill_demand AS (
    SELECT
        sj.skill_id,
        j.job_title_short,
        COUNT(*) AS total_skill_count
    FROM job_postings_fact AS j
    INNER JOIN skills_job_dim AS sj
        ON j.job_id = sj.job_id
    WHERE j.job_title_short ILIKE 'Data Analyst'
    GROUP BY sj.skill_id, j.job_title_short
)
SELECT
    sd.job_title_short,
    s.skills AS skill_name,
    sd.total_skill_count
FROM skill_demand AS sd
INNER JOIN skills_dim AS s
    ON sd.skill_id = s.skill_id
ORDER BY total_skill_count DESC
LIMIT 5;
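The relative gap between SQL and Excel follows directly from the two posting counts. A minimal sketch in Python that reproduces the calculation; the helper name `pct_more` is hypothetical, and the only inputs are the two counts reported for this query.

```python
def pct_more(a: int, b: int) -> float:
    """Return how much larger a is than b, as a percentage of b."""
    return (a - b) / b * 100


sql_postings = 92_628    # SQL count reported for this query
excel_postings = 67_031  # Excel count reported for this query

gap = pct_more(sql_postings, excel_postings)
print(f"SQL appears in {gap:.1f}% more postings than Excel")
```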
When sorted purely by average salary, niche skills such as SVN ($400K), Solidity ($179K), and Couchbase ($160K) top the list, but these averages rest on very few job postings and should be treated cautiously. The more actionable insight is in the $110K to $135K band: Kafka, Spark, Airflow, Snowflake, and Databricks are cloud and big data skills that combine meaningful job volume with strong pay.
WITH skills_in_demand AS (
    SELECT
        j.job_title_short AS job_title,
        sj.skill_id,
        ROUND(AVG(j.salary_year_avg), 0) AS avg_salary,
        COUNT(*) AS total_skill_count
    FROM job_postings_fact AS j
    INNER JOIN skills_job_dim AS sj
        ON j.job_id = sj.job_id
    WHERE j.job_title_short ILIKE 'Data Analyst'
        AND j.salary_year_avg IS NOT NULL
    GROUP BY j.job_title_short, sj.skill_id
)
SELECT
    sd.job_title,
    s.skills AS most_in_demand_skills,
    sd.avg_salary
FROM skills_in_demand AS sd
INNER JOIN skills_dim AS s
    ON sd.skill_id = s.skill_id
ORDER BY sd.avg_salary DESC;
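The caution about averages built on very few postings can be made concrete with a minimum-count filter. The following is a minimal sketch in Python using an in-memory SQLite table, so it runs without the project database. The table name `skill_salaries`, the toy rows, and the `MIN_POSTINGS` cutoff are all assumptions for illustration and do not come from the dataset.

```python
import sqlite3

# Toy rows (skill, salary) invented for illustration only; not from the dataset.
rows = [
    ("rare_skill", 400_000),    # a single posting with a very high salary
    ("common_skill", 120_000),
    ("common_skill", 110_000),
    ("common_skill", 130_000),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE skill_salaries (skill TEXT, salary REAL)")
conn.executemany("INSERT INTO skill_salaries VALUES (?, ?)", rows)

MIN_POSTINGS = 3  # hypothetical cutoff; tune it against the real data

# HAVING filters groups after aggregation, so skills backed by too few
# postings are dropped before the ranking by average salary.
query = """
    SELECT skill,
           COUNT(*) AS posting_count,
           ROUND(AVG(salary), 0) AS avg_salary
    FROM skill_salaries
    GROUP BY skill
    HAVING COUNT(*) >= ?
    ORDER BY avg_salary DESC
"""
ranked = conn.execute(query, (MIN_POSTINGS,)).fetchall()
conn.close()

for skill, posting_count, avg_salary in ranked:
    print(skill, posting_count, avg_salary)
```

Without the `HAVING` clause, the single-posting skill would rank first on the strength of one salary. In the PostgreSQL query, the analogous change would be a `HAVING COUNT(*) >= n` clause after the `GROUP BY`, with `n` chosen by inspecting the counts.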
Combining both dimensions reveals the true priority list. Python ($101K average, 57K postings), Tableau ($97K, 46K postings), and Snowflake ($111K, 6K postings) offer the best balance of job availability and salary. AWS ($106K) and Azure ($105K) are strong runners-up. SQL itself averages $96K across 92K postings: a lower salary than the others, but unmatched demand. The takeaway: SQL gets you in the door, and Python and cloud skills get you paid more.
WITH highest_paying_skills AS (
    SELECT
        j.job_title_short AS job_title,
        sj.skill_id,
        ROUND(AVG(j.salary_year_avg), 0) AS avg_salary
    FROM job_postings_fact AS j
    INNER JOIN skills_job_dim AS sj
        ON j.job_id = sj.job_id
    WHERE j.job_title_short ILIKE 'Data Analyst'
        AND j.salary_year_avg IS NOT NULL
    GROUP BY j.job_title_short, sj.skill_id
),
skill_demand AS (
    SELECT
        sj.skill_id,
        j.job_title_short,
        COUNT(*) AS demand_count
    FROM job_postings_fact AS j
    INNER JOIN skills_job_dim AS sj
        ON j.job_id = sj.job_id
    WHERE j.job_title_short ILIKE 'Data Analyst'
    GROUP BY sj.skill_id, j.job_title_short
)
SELECT
    sd.skill_id,
    s.skills AS skill_name,
    sd.demand_count,
    hps.avg_salary
FROM skill_demand AS sd
INNER JOIN highest_paying_skills AS hps
    ON sd.skill_id = hps.skill_id
    AND sd.job_title_short = hps.job_title
INNER JOIN skills_dim AS s
    ON sd.skill_id = s.skill_id
ORDER BY sd.demand_count DESC, hps.avg_salary DESC;
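One simple way to read demand and salary together is to set a demand floor and then rank what remains by pay. The sketch below, in Python, uses only the four skills for which both figures are quoted in the summary of this query, rounded to thousands. The `MIN_POSTINGS` floor and the `rank_by_salary` helper are hypothetical, and this is not the project's own scoring method.

```python
# (avg_salary_usd, postings) as quoted in the summary, rounded to thousands.
skills = {
    "SQL": (96_000, 92_000),
    "Python": (101_000, 57_000),
    "Tableau": (97_000, 46_000),
    "Snowflake": (111_000, 6_000),
}

MIN_POSTINGS = 10_000  # hypothetical demand floor, not taken from the project


def rank_by_salary(data, min_postings):
    """Keep skills with enough postings, then sort by average salary, highest first."""
    kept = [
        (name, salary, postings)
        for name, (salary, postings) in data.items()
        if postings >= min_postings
    ]
    return sorted(kept, key=lambda row: row[1], reverse=True)


shortlist = rank_by_salary(skills, MIN_POSTINGS)
for name, salary, postings in shortlist:
    print(f"{name:<10} ${salary:>7,}  {postings:>6,} postings")
```

With this floor, Snowflake drops out despite having the highest average, because its posting count is small. Lowering the floor brings it back to the top, which shows how sensitive any combined ranking is to the chosen cutoff.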
| # | Skill | Demand Count | Avg Salary | Optimality Score |
|---|---|---|---|---|