sql - Distinct on one column only in Oracle -
i use distinct on following table, on 'playerid' column. have @ moment:
matchid playerid teamid matchdate startdate ---------- ---------- ---------- --------- --------- 20 5 2 14-jan-12 01-jun-11 20 5 4 14-jan-12 01-jun-10 20 7 4 14-jan-12 01-jun-11 20 7 2 14-jan-12 01-jun-10 20 10 4 14-jan-12 01-jun-11 20 11 2 14-jan-12 01-jun-10 20 13 2 14-jan-12 01-jun-11 20 16 4 14-jan-12 01-jun-10 20 17 4 14-jan-12 01-jun-10 20 18 4 14-jan-12 01-jun-10 20 19 2 14-jan-12 01-jun-11
and want, highest 'startdate' each 'playerid' shown , next row ignored:
matchid playerid teamid matchdate startdate ---------- ---------- ---------- --------- --------- 20 5 2 14-jan-12 01-jun-11 20 7 4 14-jan-12 01-jun-11 20 10 4 14-jan-12 01-jun-11 20 11 2 14-jan-12 01-jun-10 20 13 2 14-jan-12 01-jun-11 20 16 4 14-jan-12 01-jun-10 20 17 4 14-jan-12 01-jun-10 20 18 4 14-jan-12 01-jun-10 20 19 2 14-jan-12 01-jun-11
current sql:
select pi.matchid, pi.playerid, t.teamid, m.matchdate, pf.startdate plays_in pi, match m, plays_a pa, team t, plays_for pf, made_up_of muo, season s pi.matchid = m.matchid , m.matchid = pa.matchid , pa.teamid = t.teamid , pf.playerid = pi.playerid , pf.teamid = t.teamid , muo.matchid = pi.matchid , muo.seasonid = s.seasonid , pi.matchid = '&match_id' , m.matchdate >= pf.startdate order pi.matchid asc, pi.playerid asc, pf.startdate desc;
it's oracle database.
thanks in advance.
a few points...
unless you're using joins
made_up_of
,season
filter out rows, don't need these tables. i've left them out here; can add them in if need them.mark tickner correct should use ansi join syntax. nice thing (other being standard) puts join logic right table being joined. once used think you'll find preferable.
what you're after maximum
pf.startdate
eachplayerid
, nice fit analyticalrow_number()
function.partition pi.playerid order pf.startdate desc
assign value1
row each player's recent sort date. outer filters out rows except1
ranking.you can assign rankings
rank()
,dense_rank()
analytical functions, if player has tie recent date tied dates ranked #1 , you'll multiple rows player. in situations want 1 row per player, userow_number()
instead.
put , this:
select matchid, playerid, teamid, matchdte, startdate ( select pi.matchid, pi.playerid, t.teamid, m.matchdate, pf.startdate, row_number() on (partition pi.playerid order pf.startdate desc) startdaterank plays_in pi inner join match m on pi.matchid = m.matchid inner join plays_a pa on m.matchid = pa.matchid inner join team t on pa.teamid = t.teamid inner join plays_for pf on pf.playerid = pi.playerid , pf.teamid = t.teamid pi.matchid = '&match_id' , m.matchdate >= pf.startdate ) startdaterank = 1 order matchid, playerid
one final point: based on where pi.matchid = '&match_id'
looks may using php front end , mysql
functions query. if so, please mysqli
or pdo
instead, they'll protect sql injection. mysql
functions (which officially deprecated) not.
addendum: more information row_number
, many @andriym.
with row_number
, if player has more 1 row recent date, 1 of rows assigned row_number = 1
, , row picked more or less randomly. here's example, player's recent date 5/1/2013 , player has 3 rows date:
pi.matchid pi.playerid pf.startdate ---------- ----------- ------------ 100 1000 05/01/2013 <-- row_number = 1 101 1000 04/29/2013 105 1000 05/01/2013 <-- row_number = 1 102 1000 05/01/2013 <-- row_number = 1 107 1000 04/18/2013
note one of rows above assigned row_number = 1
, , it can of them. oracle decide, not you.
if uncertainty problem, order additional columns clear winner. example, highest pi.matchid
used determine "true" row_number = 1
:
-- replace `row_number...` in query above this: row_number() on ( partition pi.playerid order pf.startdate desc, pi.matchid desc) startdaterank
now if there's tie highest pf.startdate
, oracle looks highest pi.matchid
within subset of rows highest pf.startdate
. turns out, 1 row satisfies condition:
pi.matchid pi.playerid pf.startdate ---------- ----------- ------------ 100 1000 05/01/2013 101 1000 04/29/2013 105 1000 05/01/2013 <-- row_number = 1: highest matchid -- recent startdate (5/1/2013) 102 1000 05/01/2013 107 1000 04/18/2013 <-- not considered: has highest matchid isn't -- in subset recent startdate
Comments
Post a Comment