Transparentcalifornia provides salary and pension information for all the state employees. I am going to download this data into an Oracle database first and then to 'R' to analyze and generate some interesting graphs.
Let us first look at the school district data. We have created a table "SCHOOL_DISTRICT" with the following columns as shown below. We will import the school district data into the school_district table and run some queries on that.
JobTitle |
BasePay |
OverTimePay |
OtherPay |
Benefits |
TotalPay |
TotalPay_Benefits |
School_year | Agency |
TEACHER | 86817.53 | 0 | 0 | 26158.91 | 86817.53 | 112976.44 | 2014 | ABC Unified |
TEACHER | 89501.99 | 0 | 5761 | 16385.89 | 95262.99 | 111648.88 | 2014 | ABC Unified |
Let us see which position earns the top salary in each school district.
There are 537 distinct school districts in California,so I will pick a few school districts and run the query
SELECT sal.jobtitle,sal.agency,sal.totalpay_benefits FROM (
SELECT jobtitle, to_char(totalpay_benefits, 'fm9999999.90')totalpay_benefits, agency,
RANK() OVER (PARTITION BY agency ORDER BY totalpay_benefits desc)"Rank"
FROM school_district
WHERE agency IN('Antioch Unified','Bakersfield City School District',
'TurlockUnified', 'San Francisco Unified', 'Los Angeles Unified', 'Mountain View
Whisman' ))sal
'TurlockUnified', 'San Francisco Unified', 'Los Angeles Unified', 'Mountain View
Whisman' ))sal
WHERE sal."Rank" < 2
ORDER BY sal.totalpay_benefits DESC
Comparison of Salaries among different School Districts |
Let us look at the top ten school employees in California who get paid the highest salary.
SELECT jobtitle, to_char(basepay,'fm9999999.90')basepay, overtimepay, to_char(totalpay_benefits,'fm9999999.90')totalpay_benefits, agency FROM
(SELECT jobtitle,basepay,overtimepay,totalpay_benefits,agency FROM school_district
ORDER BY totalpay_benefits DESC)
WHERE rownum < 11
Let us also look at the teachers who are in the top ten in terms of salary in California.
Top Ten Salaries of School Employees in CA |
SELECT jobtitle,to_char(basepay,'fm9999999.90')basepay, overtimepay, otherpay, benefits, to_char(totalpay_benefits,'fm9999999.90')totalpay_benefits, agency FROM
(SELECT jobtitle, basepay, overtimepay, totalpay_benefits, agency, otherpay, benefits FROM school_district
WHERE jobtitle like 'Teacher%'
ORDER BY totalpay_benefits DESC)
WHERE rownum < 11
(SELECT jobtitle, basepay, overtimepay, totalpay_benefits, agency, otherpay, benefits FROM school_district
WHERE jobtitle like 'Teacher%'
ORDER BY totalpay_benefits DESC)
WHERE rownum < 11
Top Ten Salaries of Teachers in CA |
I think by now, you must have gotten an idea of how to find the top ten values in a given data set. You can download this data from transparentcalifornia.com and try these queries and much more on your own.
We will load this data in 'R' and generate some cool graphs in my next post.