jdbc - clojure sqlkorma library: out of memory error -
i'm doing thought straightforward task: run sql query (over 65k rows of data) using sqlkorma library (http://sqlkorma.com), , each row transforming in way, , writing csv file. don't think 65k rows large given have 8gb laptop, assumed sql result set lazily fetched , whole thing never held in memory @ same time. really surprised when ended stack trace:
exception in thread "main" java.lang.outofmemoryerror: java heap space @ clojure.lang.persistenthashmap$bitmapindexednode.assoc(persistenthashmap.java:777) @ clojure.lang.persistenthashmap.createnode(persistenthashmap.java:1101) @ clojure.lang.persistenthashmap.access$600(persistenthashmap.java:28) @ clojure.lang.persistenthashmap$bitmapindexednode.assoc(persistenthashmap.java:749) @ clojure.lang.persistenthashmap$transienthashmap.doassoc(persistenthashmap.java:269) @ clojure.lang.atransientmap.assoc(atransientmap.java:64) @ clojure.lang.persistenthashmap.create(persistenthashmap.java:56) @ clojure.lang.persistenthashmap.create(persistenthashmap.java:100) @ clojure.lang.persistentarraymap.createht(persistentarraymap.java:61) @ clojure.lang.persistentarraymap.assoc(persistentarraymap.java:201) @ clojure.lang.persistentarraymap.assoc(persistentarraymap.java:29) @ clojure.lang.rt.assoc(rt.java:702) @ clojure.core$assoc.invoke(core.clj:187) @ clojure.core$zipmap.invoke(core.clj:2715) @ clojure.java.jdbc$resultset_seq$thisfn__204.invoke(jdbc.clj:243) @ clojure.java.jdbc$resultset_seq$thisfn__204$fn__205.invoke(jdbc.clj:243) @ clojure.lang.lazyseq.sval(lazyseq.java:42) @ clojure.lang.lazyseq.seq(lazyseq.java:60) @ clojure.lang.cons.next(cons.java:39) @ clojure.lang.persistentvector.create(persistentvector.java:51) @ clojure.lang.lazilypersistentvector.create(lazilypersistentvector.java:31) @ clojure.core$vec.invoke(core.clj:354) @ korma.db$exec_sql$fn__343.invoke(db.clj:203) @ clojure.java.jdbc$with_query_results_star_.invoke(jdbc.clj:669) @ korma.db$exec_sql.invoke(db.clj:202) @ korma.db$do_query$fn__351.invoke(db.clj:225) @ clojure.java.jdbc$with_connection_star_.invoke(jdbc.clj:309) @ korma.db$do_query.invoke(db.clj:224) @ korma.core$exec.invoke(core.clj:474) @ db$query_db.invoke(db.clj:23) @ main$_main.doinvoke(main.clj:32) @ clojure.lang.restfn.applyto(restfn.java:137)
as far can tell stack, has not made outside query code (meaning hasn't reached transformation/write csv code @ all). if matters, sql straightforward, select * my_table some_id not null , rownum < 65000 order some_id asc
. oracle (to explain rownum above), don' think matters.
edit:
code sample:
(defmacro query-and-print [q] `(do (dry-run ~q) ~q)) (defn query-db [] (query-and-print (select my_table (where (and (not= :my_id "bad data") (not= :my_id nil) (raw (str "rownum < " rows)))) (order :my_id :asc)))) ; args contains rows 65000, , configure-app sets jdbc ; connection string, , sets var rows value (defn -main [& args] (when (configure-app args) (let [results (query-db) dedup (dedup-with-merge results)] (println "result size: " (count results)) (println "dedup size: " (count dedup)) (to-csv "target/out.csv" (transform-data dedup)))))
clojure.java.sql
creates lazy sequences:
(defn resultset-seq "creates , returns lazy sequence of maps corresponding rows in java.sql.resultset rs. based on clojure.core/resultset-seq respects current naming strategy. duplicate column names made unique appending _n before applying naming strategy (where n unique integer)." [^resultset rs] (let [rsmeta (.getmetadata rs) idxs (range 1 (inc (.getcolumncount rsmeta))) keys (->> idxs (map (fn [^integer i] (.getcolumnlabel rsmeta i))) make-cols-unique (map (comp keyword *as-key*))) row-values (fn [] (map (fn [^integer i] (.getobject rs i)) idxs)) rows (fn thisfn [] (when (.next rs) (cons (zipmap keys (row-values)) (lazy-seq (thisfn)))))] (rows)))
korma realizes sequence dropping each row vector:
(defn- exec-sql [{:keys [results sql-str params]}] (try (case results :results (jdbc/with-query-results rs (apply vector sql-str params) (vec rs)) :keys (jdbc/do-prepared-return-keys sql-str params) (jdbc/do-prepared sql-str params)) (catch exception e (handle-exception e sql-str params))))
Comments
Post a Comment