
While testing Ansible, I stumbled over an issue with big files.

My ansible directory for testing looks something like this:

ansible
|- ansible-testing
|  |- .git
|  |- big-file-playbook.yml
|- storage
   |- bigfile.iso

This works, but since the iso is around 12 GB, I think I will run into network problems if my test machine has to provide 12 GB to 30+ machines.

So my idea is to use a network filesystem (NFS) which is temporarily mounted while I run my playbook.

I wonder if this might reduce network traffic. I have no clue whether NFS allows reading only parts of the mounted iso.
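Something like this temporary mount is what I have in mind. This is only a rough sketch using ansible.posix.mount; the export host, export path and mount point are placeholders:

- hosts: all

  tasks:

    # Temporarily mount the NFS export that holds the iso
    - ansible.posix.mount:
        src: storage-host:/srv/ansible/storage
        path: /mnt/storage
        fstype: nfs
        opts: ro
        state: mounted

    # ... tasks that read /mnt/storage/bigfile.iso ...

    # Unmount again and remove the fstab entry when done
    - ansible.posix.mount:
        path: /mnt/storage
        state: absent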

The annoying thing is that the download of the iso is restricted to very few people; they provide me the iso, and I create another download instance for it.

I know that I could host an FTPS server for this with enough bandwidth, but the idea of a lazy-loading iso seems nice.

Edit: Another idea from a coworker was to use local BitTorrent.

1 Answer


I'd propose to 1) transfer the files as a standalone task and 2) test the consistency of the files before using them.

  1. There are many options. For example, given the inventory for testing:
shell> cat hosts
[test]
test_01
test_02
test_03

[test:vars]
ansible_user=admin
ansible_become=yes
ansible_python_interpreter=/usr/local/bin/python3.8
ansible_perl_interpreter=/usr/local/bin/perl
  • Simply rsync the file to the remote hosts
shell> rsync storage/bigfile.iso admin@test_01:/tmp
...
  • You can write a script
shell> cat rsync_big_file_test.bash
#!/usr/bin/bash
# Collect the hosts of the group test from the inventory
rhosts=$(ansible-inventory --list | jq -r '.test.hosts[]')
for host in $rhosts; do
    # Look up ansible_user of the host and rsync the iso to it
    query="._meta.hostvars.${host}.ansible_user"
    ansible_user=$(ansible-inventory --list | jq -r "$query")
    echo "Syncing storage/bigfile.iso to ${ansible_user}@${host}:/tmp ..."
    rsync storage/bigfile.iso "${ansible_user}@${host}:/tmp"
done

The script reads the inventory and iterates the synchronization serially

shell> ./rsync_big_file_test.bash
Syncing storage/bigfile.iso to admin@test_01:/tmp ...
Syncing storage/bigfile.iso to admin@test_02:/tmp ...
Syncing storage/bigfile.iso to admin@test_03:/tmp ...

Advanced scripting in parallel is up to you.

  • Or, use the ansible.posix.synchronize module in a playbook

- hosts: all
    
  tasks:

    - ansible.posix.synchronize:
        src: storage/bigfile.iso
        dest: /tmp

gives (abridged)

TASK [ansible.posix.synchronize] *************************************************************
changed: [test_03]
changed: [test_02]
changed: [test_01]

See Controlling playbook execution: strategies and more. By default, Ansible runs 5 forks in parallel. It's up to you to test your bandwidth and choose a number of parallel transfers that fits it.
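For example, a sketch of limiting how many hosts transfer at the same time with serial (the batch size of 2 is arbitrary):

- hosts: all

  # Process the hosts in batches of 2, so at most 2 transfers run at once
  serial: 2

  tasks:

    - ansible.posix.synchronize:
        src: storage/bigfile.iso
        dest: /tmp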

  2. For example, get the hash
shell> sha256sum storage/bigfile.iso 
15802a9db4d3e72e1a19e053bca613ca1d5236638cf66da9dabdb578dd5ac6a2  storage/bigfile.iso

The play below tests the file on the remote hosts

shell> cat big-file-playbook.yml
- hosts: all

  vars:

    bigfile_iso_sha256: '15802a9db4d3e72e1a19e053bca613ca1d5236638cf66da9dabdb578dd5ac6a2'
    
  tasks:

    - stat:
        path: /tmp/bigfile.iso
        checksum_algorithm: sha256
      register: out
    - debug:
        var: out.stat.checksum

    - assert:
        that: out.stat.checksum == bigfile_iso_sha256
        fail_msg: '[ERR] Restore bigfile.iso.'

    - debug:
        msg: The play is running ...
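If you don't want to hard-code the hash, a possible variation (just a sketch, assuming the play is run from the ansible directory so that storage/bigfile.iso resolves on the controller) is to compute the checksum on the controller in the same play and compare against it

    # Compute the checksum of the local copy on the controller once
    - stat:
        path: storage/bigfile.iso
        checksum_algorithm: sha256
      delegate_to: localhost
      run_once: true
      become: false
      register: local_iso

    # Compare the remote checksum with the controller's checksum
    - assert:
        that: out.stat.checksum == local_iso.stat.checksum
        fail_msg: '[ERR] Restore bigfile.iso.'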
  • That's a brilliant approach and worked nicely. I am looking forward to your opinions on the WIP.
    – MaKaNu
    Commented Jan 11 at 15:34
  • I added a couple of options. Commented Jan 11 at 19:04
  • Okay, yes, I had those options in mind. Is there anything against setting it up with a local BitTorrent server?
    – MaKaNu
    Commented Jan 12 at 12:11

