2008 Implementation Workshop
Contributors to the Madagascar Project are invited to convene during May 23-27 in Golden, Colorado for the
Madagascar 2008 Implementation Workshop: Towards full automation and better robustness
The participants will work together on implementing features in Madagascar. This will not be an academic conference, but an active coding sprint!
The "deliverables" of this meeting are:
- A stronger geophysical open-source community
- A short-term Madagascar road map, created by discussing, extending, and prioritizing the feature request list
- A real move towards version 1.0
Agenda
- Fri May 23
- 11:00-12:00: computer setup, building orientation for first-time visitors, informal conversations, etc.
- 12:00-13:00: Lunch
- 13:00-13:15: Sergey Fomel: A vision of Madagascar towards version 1.0 and beyond
- 13:15-15:00: Everybody: Deciding the list of features to be worked on
- 15:00: start of actual implementation work
- Tue May 27
- Finalization of implementation work
- Subject to participant interest, an afternoon hiking trip followed by dinner
For participants sponsored by their employer, it is worth noting that the coding sprint takes place over the three-day Memorial Day weekend. If your management agrees to let you swap this period for the Wednesday, Thursday and Friday following the meeting, you can get a five-day weekend in the Rocky Mountains!
Location
- The workshop will take place at the Geophysics Department of Colorado School of Mines, in Golden, Colorado.
- A meeting venue with internet connection will be kindly provided by the Center for Wave Phenomena, its location to be announced at a later time.
Registration and materials
This is a meeting of developers, not a user meeting. Prospective participants who are not current contributors are welcome if they can explain how they would like to contribute.
Please drop Paul Sava a line by Thursday, May 15, to let him know that you will be coming, so that a meeting room of appropriate size can be arranged. Please also let him know whether you can bring your own laptop. Ideally, as little time as possible should be spent on logistics and setup, and as much as possible on coding.
Costs and funding
The Madagascar project will not charge participants any fees, but it cannot provide funding either.
Accommodation and transportation
Hotels within walking distance of the CSM campus include Table Mountain Inn, The Golden Hotel and Dove Inn Bed & Breakfast. Other nearby hotels include:
- Hampton Inn Denver West (3 miles)
- Courtyard by Marriott Denver West (3.5 miles)
- Residence Inn, 14600 W 6th Ave, Frontage Road, Golden, CO, 303-271-0909 (3.5 miles)
- TownePlace Suites Lakewood, 800 Tabor Street, Golden, CO, 303-232-7790 (5 miles)
- Fairfield Inn Lakewood, 11907 W 6th Ave, Golden, CO, 303-231-9939 (5 miles)
- Sheraton Denver West (5 miles)
- Hampton Inn Denver West/Federal Center (5.1 miles)
- Homewood Suites Denver West/Lakewood (5.2 miles)
Super Shuttle provides reliable ground transportation to Golden from Denver International Airport with advance, prepaid reservations. The current round-trip cost is $68.
Pool of features (preliminary)
The m8r features under consideration can be grouped into several categories, which are listed below. The grouping attempts to describe the software equivalent of Maslow's pyramid of needs. First the basics (reproducibility, I/O, parallelization, graphics) should be consolidated, then efforts should move up to numerical tools like solvers, FFTs and transposes, then up to widely-used geophysical algorithms. Interested parties are invited to brainstorm below!
- Features that provide functionality that is needed in order to have a minimalistic fully-automated m8r project setup
- Vplot diffs: These would allow the m8r project to fulfill one of its main goals – having fully automatic regression tests. See Feature request tracker
- rsfbook completion, also from the Feature request tracker
- Automatically syncing the Wiki "Guide to programs" with the self-doc
- Saving a static copy of the wiki, so that consulting important parts of the documentation does not require a centralized, brittle software stack (local internet connectivity + remote web server + PHP + SQL + MediaWiki, subject to constant spam attacks) to be running at that very moment. This would also allow splitting the "Guide to programs" into one-page-per-program files, to be concatenated into one searchable page in the static copy of the wiki, and would allow user-contributed parts to be included in the HTML self-doc
- Features that make m8r more user-friendly
- A sane configuration script for the tex2pdf reproducible paper engine, one that pulls a complete list of missing dependencies and has a way of extracting them from TeX Live (or instructing the user to do so). Right now the huge size of TeX Live (over 1 GB) and its fast pace of advancement preclude its installation on legacy systems (e.g. RHEL 2, 3, 4). The goal is to make the PDF paper-generating mechanism as easy to install as the rest of m8r, i.e. configuration either fails with a helpful error message or everything installs fine and it Just Works.
- sfdatadoc – see Feature request tracker
- Go through all the programs to make sure that there are no undocumented or unclearly specified parameters. Also look at inconsistencies (when parameters with similar meaning get different names in different programs). Establish and implement coding conventions.
- Binary packages – see Feature request tracker (Note 1: there has been noise since 2006 on the SCons developer mailing lists about providing a SCons command that would automatically create RPMs and maybe debs as well. Does anyone know the implementation status of that? Note 2: Packages are typically split into a "package core" providing executables and help for them, and a "package-devel" providing a software development kit consisting of development libraries, headers and other include files, API documentation, etc. Doing things this way would require deep changes to the current Madagascar build mechanisms)
- man pages – see Feature request tracker
- Decide how to ensure that the Task-centric program list is always synced with the source code (keywords picked up by rsfdoc?)
- Create a "Migrant's dictionary" for newcomers from other packages, by using their task-centric pages such as this one to show which m8r programs correspond to which programs from their "home country"
- TKSU-based GUI – see Feature request tracker (Note: the Madagascar plugin for OpendTect is also on track)
- Features that extend m8r's capabilities
- M8r-based programming
- Java API (more details on the GSOC2008 page)
- Extending the Python interface (more details on the GSOC2008 page)
- Finishing the Octave interface
- A tool to convert a "normal" rsf file into a "transfer-ready" one: the binary is converted to portable XDR and then gzipped; the ASCII header records the number of bytes of the compressed file and the MD5 sums of the binary and of the compressed binary; and the format is indicated by a keyword like form=xdr.gz . I checked with various types and sizes of data, and even in the worst-case scenario a compression factor of 2 is attained. This would allow safe, fully automatic data transfers.
- Introduce a sane way to control the optimization level for all languages in an m8r build, using a single flag. A researcher using Madagascar should not need to be an expert in compiler usage; they should just set the level. For example:
- Level A, have all warnings and debugging info turned on. Link to rsflib version compiled similarly.
- Level B, compile with -O2 and link against rsflib version compiled similarly.
- Level C, same as level B, and also detect whether the compiler has interprocedural optimization abilities. If yes, compile everything with the appropriate flags. If the compiler cannot optimize across multiple files, have Python feed it a concatenation of the source files that make up a single program, instead of relying on "include" statements.
- Level D, same as level C, and also have a special SCons "Optiflow" rule type, in which programs to be run are actually recompiled with parameters hardwired into them, so that optimizations such as dead branch elimination and loop unrolling are feasible. Use SCons to perform compilations in parallel. A fully-optimized compilation of most of m8r for a single flow may seem like overkill, but is necessary in the case of huge datasets that take weeks and months to process.
- Level E, same as level D, but compile with -O3 and also, where available, use special alternative versions of the codes optimized by hand and with less error checking.
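As a rough illustration of the single-flag idea, the five levels above could be reduced to a lookup like the following Python sketch. Everything named here (BuildSettings, settings_for_level, the compiler_has_ipo argument) is hypothetical and not part of Madagascar. The flags assume a GCC-style compiler, with -g -Wall standing in for "warnings and debugging info" and -flto standing in for whatever interprocedural-optimization switch the compiler actually offers.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BuildSettings:
    """Hypothetical container for what one optimization level implies."""
    cflags: List[str]
    specialize_per_flow: bool = False  # level D idea: recompile per flow
    use_hand_tuned: bool = False       # level E idea: hand-optimized codes


def settings_for_level(level, compiler_has_ipo=False):
    """Translate a level letter (A-E) into illustrative build settings."""
    level = level.upper()
    if level not in ("A", "B", "C", "D", "E"):
        raise ValueError("level must be one of A, B, C, D, E")
    if level == "A":
        # Assumed GCC-style flags for debugging info and warnings.
        return BuildSettings(cflags=["-g", "-Wall"])
    cflags = ["-O3" if level == "E" else "-O2"]
    if level in ("C", "D", "E") and compiler_has_ipo:
        # Assumed flag; the real switch depends on the compiler.
        cflags.append("-flto")
    return BuildSettings(
        cflags=cflags,
        specialize_per_flow=level in ("D", "E"),
        use_hand_tuned=(level == "E"),
    )
```

In an SConstruct, the resulting cflags could then be handed to the build environment (for example through env.Append(CCFLAGS=...)), so that the researcher only ever touches the level letter.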
- Graphics
- Bezier curves in vplot – see Feature request tracker
- xtpen antialiasing – see Feature request tracker
- bargraph – see Feature request tracker
- graph3 completion – see Feature request tracker
- Geophysical/numerical tools
- Harlan's CG – see Feature request tracker
- conjgrad extensions – see Feature request tracker
- kirmod in layered media – see Feature request tracker
- minidds – see Feature request tracker
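For the "transfer-ready" rsf idea in the list above, here is a minimal Python sketch of the binary half of the conversion. It assumes the data are 4-byte samples and that "the binary" whose MD5 sum is recorded is the XDR version. The function names and every header key except form=xdr.gz are hypothetical, chosen only for illustration, and nothing here is an existing Madagascar API.

```python
import gzip
import hashlib
import struct


def to_transfer_ready(native_bytes, byteorder="little"):
    """Convert raw 4-byte samples to gzipped XDR plus header fields.

    Hypothetical helper. Only form=xdr.gz comes from the proposal;
    the other key names are made up for this sketch.
    """
    if len(native_bytes) % 4 != 0:
        raise ValueError("expected a whole number of 4-byte samples")
    count = len(native_bytes) // 4
    src = "<" if byteorder == "little" else ">"
    # XDR is big-endian. Reinterpreting samples as 32-bit words swaps the
    # bytes without altering the bit patterns of the floats.
    words = struct.unpack(f"{src}{count}I", native_bytes)
    xdr = struct.pack(f">{count}I", *words)
    # mtime=0 (Python 3.8+) keeps the gzip output, and its MD5, reproducible.
    payload = gzip.compress(xdr, mtime=0)
    header = {
        "form": "xdr.gz",
        "gz_bytes": len(payload),
        "xdr_md5": hashlib.md5(xdr).hexdigest(),
        "gz_md5": hashlib.md5(payload).hexdigest(),
    }
    return header, payload


def verify_transfer(header, payload):
    """Check a received payload against the header written by the sender."""
    if len(payload) != header["gz_bytes"]:
        return False
    if hashlib.md5(payload).hexdigest() != header["gz_md5"]:
        return False
    xdr = gzip.decompress(payload)
    return hashlib.md5(xdr).hexdigest() == header["xdr_md5"]
```

A receiving script would call verify_transfer before unpacking anything. The byte count catches truncated transfers cheaply, and the two MD5 sums catch corruption before and after decompression, which is what would make an unattended transfer safe to automate.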


