Advanced SQL Challenge


The Challenge

Here is a challenge for SQL enthusiasts.
I'll solve it here using PostgreSQL 10.10 on Kubuntu 18.04; but feel free to give it a try in your favorite RDBMS.
We have a simple table with two columns : Column dt is a date, and column rt is an exchange rate, showing real values of USD/TRY from July/August 2018. That's a period where this rate hits a peak.


The table contains data between 16 July and 17 August 2018. But it's far from being complete. Actually it contains only the Tuesdays and Thursdays, plus the first day (16 July) and the last day (17 August).
Our goal is to calculate substitute values for each of these missing days. The result dataset shall contain all dates between 16 July and 17 August 2018.




usd_tr_rates table contents 



dt rt
07/16/18 4.8516
07/17/18 4.8525
07/19/18 4.8389
07/24/18 4.7933
07/26/18 4.8342
07/31/18 4.9161
08/02/18 5.0671
08/07/18 5.2808
08/09/18 5.4167
08/14/18 6.5681
08/16/18 5.8174
08/17/18 6.0142



Load Source Table Data

Hereunder you can find DDL/DML statements to prepare source data in PostgreSQL. At least the insert statements should work for other databases as well.


CREATE DATABASE olric;
\c olric;
CREATE TABLE usd_tr_rates (dt date, rt decimal(18,8));
insert into usd_tr_rates (dt, rt) values ('16/7/2018', 4.8516);
insert into usd_tr_rates (dt, rt) values ('17/7/2018', 4.8525);
insert into usd_tr_rates (dt, rt) values ('19/7/2018', 4.8389);
insert into usd_tr_rates (dt, rt) values ('24/7/2018', 4.7933);
insert into usd_tr_rates (dt, rt) values ('26/7/2018', 4.8342);
insert into usd_tr_rates (dt, rt) values ('31/7/2018', 4.9161);
insert into usd_tr_rates (dt, rt) values ('2/8/2018', 5.0671);
insert into usd_tr_rates (dt, rt) values ('7/8/2018', 5.2808);
insert into usd_tr_rates (dt, rt) values ('9/8/2018', 5.4167);
insert into usd_tr_rates (dt, rt) values ('14/8/2018', 6.5681);
insert into usd_tr_rates (dt, rt) values ('16/8/2018', 5.8174);
insert into usd_tr_rates (dt, rt) values ('17/8/2018', 6.0142);


Required Output

Your SQL shall generate the following output.


dt_newrt_new
2018-07-164.85160000000000000000
2018-07-174.85250000000000000000
2018-07-184.84570000000000000000
2018-07-194.83890000000000000000
2018-07-204.82978000000000000000
2018-07-214.82066000000000000000
2018-07-224.81154000000000000000
2018-07-234.80242000000000000000
2018-07-244.79330000000000000000
2018-07-254.81375000000000000000
2018-07-264.83420000000000000000
2018-07-274.85058000000000000000
2018-07-284.86696000000000000000
2018-07-294.88334000000000000000
2018-07-304.89972000000000000000
2018-07-314.91610000000000000000
2018-08-014.99160000000000000000
2018-08-025.06710000000000000000
2018-08-035.10984000000000000000
2018-08-045.15258000000000000000
2018-08-055.19532000000000000000
2018-08-065.23806000000000000000
2018-08-075.28080000000000000000
2018-08-085.34875000000000000000
2018-08-095.41670000000000000000
2018-08-105.64698000000000000000
2018-08-115.87726000000000000000
2018-08-126.10754000000000000000
2018-08-136.33782000000000000000
2018-08-146.56810000000000000000
2018-08-156.19275000000000000000
2018-08-165.81740000000000000000



Solution (PostgreSQL)

My solution in PostgreSQL is as follows.
with rates1 as
    (
    select dt, rt,
        max(dt) over (partition by 1 order by dt
            rows between 1 following and 1 following) next_dt,
        max(rt) over (partition by 1 order by dt
            rows between 1 following and 1 following) next_rt
    from usd_tr_rates
    ),
    rates2 as
    (
    select dt, rt, next_dt, next_rt,
        next_dt - dt multip
    from rates1
    ),
    series as
    (
    select * from generate_series(1,10)
    )
select
    rates2.dt+series.generate_series-1 dt_new,
    rates2.rt+((series.generate_series-1) * ((next_rt - rt)/multip)) rt_new
from
    rates2
join
    series
on
    series.generate_series<=rates2.multip
order by
    dt, generate_series;

No comments:

Post a Comment