sql - Distinct on one column only in Oracle -
i use distinct on following table, on 'playerid' column. have @ moment:
matchid playerid teamid matchdate startdate ---------- ---------- ---------- --------- --------- 20 5 2 14-jan-12 01-jun-11 20 5 4 14-jan-12 01-jun-10 20 7 4 14-jan-12 01-jun-11 20 7 2 14-jan-12 01-jun-10 20 10 4 14-jan-12 01-jun-11 20 11 2 14-jan-12 01-jun-10 20 13 2 14-jan-12 01-jun-11 20 16 4 14-jan-12 01-jun-10 20 17 4 14-jan-12 01-jun-10 20 18 4 14-jan-12 01-jun-10 20 19 2 14-jan-12 01-jun-11 and want, highest 'startdate' each 'playerid' shown , next row ignored:
matchid playerid teamid matchdate startdate ---------- ---------- ---------- --------- --------- 20 5 2 14-jan-12 01-jun-11 20 7 4 14-jan-12 01-jun-11 20 10 4 14-jan-12 01-jun-11 20 11 2 14-jan-12 01-jun-10 20 13 2 14-jan-12 01-jun-11 20 16 4 14-jan-12 01-jun-10 20 17 4 14-jan-12 01-jun-10 20 18 4 14-jan-12 01-jun-10 20 19 2 14-jan-12 01-jun-11 current sql:
select pi.matchid, pi.playerid, t.teamid, m.matchdate, pf.startdate plays_in pi, match m, plays_a pa, team t, plays_for pf, made_up_of muo, season s pi.matchid = m.matchid , m.matchid = pa.matchid , pa.teamid = t.teamid , pf.playerid = pi.playerid , pf.teamid = t.teamid , muo.matchid = pi.matchid , muo.seasonid = s.seasonid , pi.matchid = '&match_id' , m.matchdate >= pf.startdate order pi.matchid asc, pi.playerid asc, pf.startdate desc; it's oracle database.
thanks in advance.
a few points...
unless you're using joins
made_up_of,seasonfilter out rows, don't need these tables. i've left them out here; can add them in if need them.mark tickner correct should use ansi join syntax. nice thing (other being standard) puts join logic right table being joined. once used think you'll find preferable.
what you're after maximum
pf.startdateeachplayerid, nice fit analyticalrow_number()function.partition pi.playerid order pf.startdate descassign value1row each player's recent sort date. outer filters out rows except1ranking.you can assign rankings
rank(),dense_rank()analytical functions, if player has tie recent date tied dates ranked #1 , you'll multiple rows player. in situations want 1 row per player, userow_number()instead.
put , this:
select matchid, playerid, teamid, matchdte, startdate ( select pi.matchid, pi.playerid, t.teamid, m.matchdate, pf.startdate, row_number() on (partition pi.playerid order pf.startdate desc) startdaterank plays_in pi inner join match m on pi.matchid = m.matchid inner join plays_a pa on m.matchid = pa.matchid inner join team t on pa.teamid = t.teamid inner join plays_for pf on pf.playerid = pi.playerid , pf.teamid = t.teamid pi.matchid = '&match_id' , m.matchdate >= pf.startdate ) startdaterank = 1 order matchid, playerid one final point: based on where pi.matchid = '&match_id' looks may using php front end , mysql functions query. if so, please mysqli or pdo instead, they'll protect sql injection. mysql functions (which officially deprecated) not.
addendum: more information row_number, many @andriym.
with row_number, if player has more 1 row recent date, 1 of rows assigned row_number = 1, , row picked more or less randomly. here's example, player's recent date 5/1/2013 , player has 3 rows date:
pi.matchid pi.playerid pf.startdate ---------- ----------- ------------ 100 1000 05/01/2013 <-- row_number = 1 101 1000 04/29/2013 105 1000 05/01/2013 <-- row_number = 1 102 1000 05/01/2013 <-- row_number = 1 107 1000 04/18/2013 note one of rows above assigned row_number = 1, , it can of them. oracle decide, not you.
if uncertainty problem, order additional columns clear winner. example, highest pi.matchid used determine "true" row_number = 1:
-- replace `row_number...` in query above this: row_number() on ( partition pi.playerid order pf.startdate desc, pi.matchid desc) startdaterank now if there's tie highest pf.startdate, oracle looks highest pi.matchid within subset of rows highest pf.startdate. turns out, 1 row satisfies condition:
pi.matchid pi.playerid pf.startdate ---------- ----------- ------------ 100 1000 05/01/2013 101 1000 04/29/2013 105 1000 05/01/2013 <-- row_number = 1: highest matchid -- recent startdate (5/1/2013) 102 1000 05/01/2013 107 1000 04/18/2013 <-- not considered: has highest matchid isn't -- in subset recent startdate
Comments
Post a Comment