Loading multiple CSV files into MySQL


I am working on a metrics project for my team. I have to load several different reports into a central repository, create tables, and build reports off of that data.

The data sources are:

  1. CSV files
  2. PDFs
  3. Ad-hoc/manual data

I am playing with Talend and MySQL. I am a little confused about how to load the CSV files. Should I have a collection of directories, and one or more scheduled tasks that load the files?

Another thought is to write a custom file processor that loads each file based on a naming convention. Any thoughts?
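To make the naming-convention idea concrete, here is a minimal sketch in Python. The convention itself (`<report>_<YYYYMMDD>.csv`), the report names, and the staging table names are all assumptions made up for illustration; they are not part of any existing setup.

```python
import re
from pathlib import Path
from typing import Optional

# Hypothetical convention: <report>_<YYYYMMDD>.csv, e.g. sales_20240131.csv
FILENAME_PATTERN = re.compile(r"^(?P<report>[a-z][a-z0-9_]*)_(?P<date>\d{8})\.csv$")

# Hypothetical mapping from report name to target staging table.
TABLE_FOR_REPORT = {
    "sales": "stg_sales",
    "tickets": "stg_tickets",
}


def target_table(path: Path) -> Optional[str]:
    """Return the staging table for a file, or None if the name does not
    follow the convention or the report is unknown."""
    match = FILENAME_PATTERN.match(path.name.lower())
    if match is None:
        return None
    return TABLE_FOR_REPORT.get(match.group("report"))
```

A scheduled task could then walk the input directory, call `target_table` on each file, and skip (or flag) anything that returns `None`.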

The "PDF" part is complicated. PDF... "Ad-hoc/manual data" needs more details.
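If we focus on CSV, which (if I'm right) is what the question is really about, I'd write an app that calls an SP (stored procedure) in the MySQL DB, handing over the full path of the CSV (and additional data, such as the table's "user friendly name" if needed, or any other metadata you'd like to store), and that executes the import using MySQL LOAD DATA. One caveat: MySQL does not permit LOAD DATA inside stored routines, so in practice the app has to issue the LOAD DATA statement itself and leave any follow-up processing to the SP.
The reason is that there can be many rules in the "business logic" after a CSV is imported, and it's easier to maintain an app according to changing business requirements than to keep changing the DB's behavior all the time. And if something goes terribly wrong, the DB is safe and only the "import manager app" fails. You also don't have to store either the app or the CSVs on the same system the DB is on.
DBs, relational DBs at least, are for storing data and retrieving data rapidly based on 'set theory', not for taking care of how the data gets into the system.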
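A minimal sketch of such an "import manager app" step, assuming Python and a DB-API style MySQL driver (for example PyMySQL or mysql-connector-python) whose cursor accepts `%s` placeholders and interpolates them on the client side. The function names and the CSV layout (comma-separated, optionally double-quoted fields, one header row) are assumptions for illustration. `LOAD DATA LOCAL INFILE` also needs `local_infile` enabled on both the client and the server.

```python
import re
from pathlib import Path

# Conservative whitelist for table names, which cannot be bound as parameters.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_load_data(csv_path: Path, table: str):
    """Return (sql, params) for a LOAD DATA LOCAL INFILE statement."""
    if not _IDENTIFIER.match(table):
        raise ValueError(f"unsafe table name: {table!r}")
    sql = (
        f"LOAD DATA LOCAL INFILE %s INTO TABLE `{table}` "
        "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' "
        "LINES TERMINATED BY '\\n' "
        "IGNORE 1 LINES"  # assumed: the first line of the CSV is a header
    )
    # Forward slashes keep the path valid on Windows clients as well.
    return sql, (csv_path.as_posix(),)


def import_csv(cursor, csv_path: Path, table: str) -> None:
    """Run the import through an already opened DB-API cursor."""
    sql, params = build_load_data(csv_path, table)
    cursor.execute(sql, params)
```

Keeping the statement builder separate from the cursor call makes it possible to check the business rules without a live database, and the app remains the only place that has to change when the CSV format does.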

Think about these questions before you start implementing anything:

  • What happens to a CSV after it is processed? Can it be deleted? Should it be moved to e.g. a "processed" folder? Should it remain/stay intact?
  • If it should stay where it was, how should you know that you have already processed the file? (Set a "ready to archive" flag, for instance? Touch the "last modified" date and set it to 1950.01.01? Add a property to the file?)
  • What should happen if the CSV import fails (e.g. invalid data in the file, or a null value where there shouldn't be nulls)? Display an error? Mark the CSV as unusable? Send an e-mail? Move it to a "processing_failed" folder?
  • What if the file count grows huge in the input folder?
  • How can you change the import/process/etc. if the business logic changes, or the CSV format changes?

And so on. Think through the options you have, and decide.
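As one possible set of answers to the questions above, here is a minimal Python sketch that moves each file to a "processed" or "processing_failed" subfolder and caps how many files a single run handles. The function name `process_inbox`, the `importer` callback, and the default batch size are assumptions for illustration; deleting files, flagging them, or sending e-mails would be equally valid choices.

```python
import shutil
from pathlib import Path
from typing import Callable


def process_inbox(inbox: Path, importer: Callable[[Path], None],
                  batch_size: int = 100) -> dict:
    """Import up to batch_size CSV files from inbox and file each one
    under 'processed' or 'processing_failed' depending on the outcome."""
    processed_dir = inbox / "processed"
    failed_dir = inbox / "processing_failed"
    processed_dir.mkdir(exist_ok=True)
    failed_dir.mkdir(exist_ok=True)

    results = {"processed": [], "failed": []}
    # Sorted for a predictable order; the slice keeps one run bounded
    # even if the input folder has grown huge.
    for csv_file in sorted(inbox.glob("*.csv"))[:batch_size]:
        try:
            importer(csv_file)
        except Exception as exc:
            # Keep the file for inspection and record why it failed.
            shutil.move(str(csv_file), str(failed_dir / csv_file.name))
            results["failed"].append((csv_file.name, str(exc)))
        else:
            shutil.move(str(csv_file), str(processed_dir / csv_file.name))
            results["processed"].append(csv_file.name)
    return results
```

Because the importer is passed in as a callback, the folder handling stays the same when the import logic or the CSV format changes.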

I hope this answered your question ;)

