3

I am rather new to DevOps. Trying to organize my project for continuous integration (CI), I am facing a problem for which I haven't been able to find a solution yet. I would like to know what is the best way to manage my database file(s) in the setup that I have so that the latest version of DB can be accessed by developers as well as Jenkins.

Similar questions have been touched on in various threads, such as the ones below. But none answered the question specific to my situation.

Setup

Here is what my setup looks like:

  • Language: Python 3.5 (with unittests, etc.)
  • Database: MySQL - the data updates once a day (not the schema)
  • Version Control: Git
  • Code Hosting: Bitbucket
  • Development OS: Windows 10
  • CI Machine: Amazon AWS EC2 running Linux (Ubuntu 16.04.1)
  • CI Tool: Jenkins

I have set up the CI such that whenever I update origin/master, Jenkins runs the tests on the AWS machine.

EDIT To clarify what my database is, suppose it is some kind of daily sales figures or daily market update. I have a code to update the db once a day and another program to perform some sort of data analytics on it.

Problem

I am not concerned about the version control of the database. In other words, I don't care what my DB was two days ago. However, since my database will be updated every day, I want different dev systems as well as Jenkins to access the latest every time a test is run. From what I have read so far, it is not good practice to include the database in my Git repository (and I wouldn't want to do that since I don't care about tracking changes on it). So the question is how should I store my database and integrate it into my project?

Some have suggested using Dropbox or other tools to access the DB anywhere. Is this the best approach? If yes, what exactly should be kept on Dropbox? Actual database (*.db) files and folders? In MySQL those are kept centrally in the /data/ folder of MySQL and I have no idea how to store them on another folder such as Dropbox.

Others suggested keeping a dump of the DB. What should the process be? To generate the dump after the daily change of the database and keep the .txt dump files on Dropbox? How would Jenkins re-create the DB from these dump files every time?

7
  • Did I understand correctly that you somewhere have a production/master/live database, and you want to have your development db server to have updated data every morning? Commented Aug 26, 2016 at 18:36
  • I think the answer is 'Yes'. Just edited to answer your question: My DB could be some kind of daily sales figures or daily market update. I have a code to update the db once a day and another program to perform some sort of data analytics on it.
    – hANI
    Commented Aug 26, 2016 at 18:55
  • I still don't understand the question; if you have a database server, with data on it, what more do you need? (I probably never had similar use case so I don't get it :) Commented Aug 26, 2016 at 19:06
  • Right now, my database is kept locally (a local MySQL server) on the dev machine. There is no server that all the users including Jenkins CI can access. My question is how should I sync the database across all the machines, so once it is updated on a dev machine, then Jenkins, etc. also will have the latest update. At the moment I am not keeping it in the source repo which I keep sync'ed all across. I hope that clarifies it.
    – hANI
    Commented Aug 26, 2016 at 20:14
  • I see. Is there a reason not to have one central dev database server? User rights can be configured if not everyone should have write permissions, and also anyone could simply make a local copy in the morning if they need to, Commented Aug 26, 2016 at 20:29

1 Answer 1

0

I wouldn't move the actual data around. If it's possible, and as long as it meets your security needs, I would just make the connection to your MySQL server accessible from wherever you are doing development or testing, and then modify the connection string to point to that server.

3
  • I am still not sure where I keep the actual database? On another repository or on a secure hosting and give access to all the users, including Jenkins?
    – hANI
    Commented Aug 26, 2016 at 18:58
  • It's hard to say what works best for you. I would probably host it on a secure VM/Machine by itself, and give the Jenkins server and your development machines access to it.
    – James Rice
    Commented Aug 28, 2016 at 17:32
  • Now, after reading your comments above about your limitations (can't comment there because I don't have the reputation needed, and I can't edit my comments after 5 minutes?). I agree with Dusan, Put it on your Jenkins CI machine.
    – James Rice
    Commented Aug 28, 2016 at 17:41

Not the answer you're looking for? Browse other questions tagged or ask your own question.