Getting just the tip of a remote git branch

June 7th, 2011

As projects move their code under git control, people get frustrated about being unable to do the most basic operations they are used to performing with SVN or CVS. That’s a fact, so let’s see if I can relieve some pain by sharing what I know or learn as I climb the learning curve myself.

Yesterday I met Markus Neteler, who was complaining about being unable to check out the release branch of QuantumGIS without filling up his laptop hard drive. He got me curious, so here are some numbers and a recipe.

An SVN checkout as of April 30 is 281 Mb in size, 1 of which is the .svn directory.

A full git clone at the time of writing (June 7) is 330 Mb in size, of which 200 are the .git database and 130 are the working copy (the “checkout”).

A full git clone contains all the data available in the original repository. Once you have the clone, you have all the branches and all the history, with no need for any more bandwidth.

But Markus was only interested in a single branch, not the whole set, and he wanted no history either. So he could clone just the objects referenced by the commit at the tip of the release-1_7_0 branch, with no further parents (back history). Here’s how you do it:

git clone --depth 1 --branch release-1_7_0 git://github.com/qgis/Quantum-GIS.git

The resulting shallow repository (the .git directory) is 110 Mb in size. Add 133 Mb of working directory (yes, release-1_7_0 is 3 Mb bigger than master) for a total of 243 Mb disk space used.
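To see what --depth 1 does without downloading anything, here is a minimal sketch that builds a throwaway local repository and shallow-clones one branch of it. The repository, the branch names (trunk, release-demo) and the demo identity are made up for illustration; only the clone options come from the recipe above.

```shell
# Build a throwaway repository with three commits (all names are hypothetical).
work=$(mktemp -d)
git init -q "$work/origin"
git -C "$work/origin" symbolic-ref HEAD refs/heads/trunk
for n in 1 2 3; do
  echo "change $n" > "$work/origin/file.txt"
  git -C "$work/origin" add file.txt
  git -C "$work/origin" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "commit $n"
done
git -C "$work/origin" branch release-demo

# Shallow-clone only the tip of one branch. The file:// prefix matters here:
# git ignores --depth when cloning from a plain local path.
git clone -q --depth 1 --branch release-demo "file://$work/origin" "$work/shallow"

# Count the commits reachable in each repository.
full_count=$(git -C "$work/origin" rev-list --count release-demo)
shallow_count=$(git -C "$work/shallow" rev-list --count HEAD)
echo "commits in origin: $full_count, commits in shallow clone: $shallow_count"
```

The shallow clone should report a single commit, which is why its .git directory is so much smaller than that of a full clone.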

NOTES:
  1. A shallow repository (one with short history) cannot be further cloned, but there are no problems pulling updates from the origin, producing patches or pushing changes.
  2. If you don’t know the name of the branch in advance, you can query it from the remote repository using git ls-remote.
  3. Every git command has a manual page in the form git-command (e.g. man git-ls-remote).
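
As a sketch of note 2, this lists the branch names a repository advertises. To keep it runnable offline it queries a throwaway local repository; the branch names and the demo identity are hypothetical. Against a real remote you would pass its URL to git ls-remote --heads instead.

```shell
# Throwaway repository standing in for the remote (names are hypothetical).
work=$(mktemp -d)
git init -q "$work/origin"
git -C "$work/origin" symbolic-ref HEAD refs/heads/trunk
echo hello > "$work/origin/file.txt"
git -C "$work/origin" add file.txt
git -C "$work/origin" -c user.name=demo -c user.email=demo@example.com \
  commit -q -m "initial commit"
git -C "$work/origin" branch release-demo

# Each output line is "<commit id><TAB>refs/heads/<branch>"; keep only the
# branch name and join the names on one line.
branches=$(git ls-remote --heads "file://$work/origin" \
  | sed 's|.*refs/heads/||' | sort | paste -s -d ' ' -)
echo "branches: $branches"
```
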
Happy learning !

GEOS 3.3.0

May 31st, 2011

GEOS 3.3.0 is out: http://download.osgeo.org/geos/geos-3.3.0.tar.bz2

This release introduces a fair amount of new C-API interfaces and a brand new PHP binding. Full details in the NEWS file: http://trac.osgeo.org/geos/browser/tags/3.3.0/NEWS

As with any release since 3.0.0 there is complete binary compatibility with clients linked against the C-API. These include, but are not limited to, PostGIS. For a list of known clients: http://trac.osgeo.org/geos/wiki/Applications (add yours, if not already listed!)

GEOS is a C++ port of the JTS Topology Suite. This release targets version 1.12 of the library, but doesn’t reach full feature parity yet. Missing JTS functionalities:

  • Densifier class
  • Geometric similarity detection package (HausdorffSimilarityMeasure, AreaSimilarityMeasure)
  • MinimumDiameter.getMinimumRectangle()
  • Triangulation API
  • VoronoiDiagramBuilder
  • createSquircle and createSuperCircle in GeometricShapeFactory
  • MinimumClearance class
  • nearestNeighbours method to STRtree
  • RandomPointsBuilder / RandomPointsInGridBuilder
  • KochSnowflakeBuilder
  • SierpinskiCarpetBuilder

If you’d like to sponsor development of any of the above items (or others) for the next feature release of GEOS (3.4.0) please drop me a note.

Free software and participatory technologies

May 30th, 2011

On Monday, June 6th, 2011 at 20:30, an open meeting on free software and participatory technologies will be held at the premises of the Cubibi’ association in Ispra.

[ via Madonnina del Grappa, 40-48 - Ispra. June 6th, 2011, 20:30 ]

The meeting is aimed at everyone who cultivates their curiosity as a resource and wants to discover more about the electronic tools that increasingly mediate our personal and social activities, extending or limiting their possibilities, helping or hindering our ability to choose.

UPDATE: here is the material for printing the flyer.

PostGIS / GEOS / MapServer with git

April 27th, 2011

I’ve set up git mirrors of the PostGIS, GEOS and MapServer SVN repositories, updated hourly. You can clone the git repositories and re-attach to the SVN ones with this simple script (untested):

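for repo in postgis geos mapserver; do
  # clone the hourly-updated git mirror
  git clone git://github.com/strk/${repo}.git
  cd ${repo}
  # point git-svn at the upstream SVN trunk
  git svn init http://svn.osgeo.org/${repo}/trunk
  # make the git-svn remote ref point at the mirrored master
  git update-ref refs/remotes/git-svn refs/heads/master
  git svn fetch
  cd -
done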

After that you can run git svn rebase to get changes from SVN, or git pull to get changes from the git mirror (which may be up to one hour behind).

Happy hacking !

Why Political Liberty Depends on Software Freedom More Than Ever

March 11th, 2011

I’ve spent a few days adding Italian subtitles to a video recording of the speech given by Eben Moglen at FOSDEM 2011 about privacy, freedom and net neutrality.

I would have liked to embed the video right here, but these advanced friendly systems make everything so… ehm… difficult !

Well, take a look at the simple version of it :)

Hack Fever

September 24th, 2010

Got seasonal flu this week, forcing me home… Nothing better for some Gnash hacking !

I had an itch to scratch: a degraded experience when playing the wonderful Winterbells game.

It started in June when I realized that Gnash-0.8.8 could not start the game properly and was suspiciously slower than 0.8.7. In late July I profiled playback and found out that the performance penalty was introduced by a compatibility fix (property name case handling in SWF up to version 6).

So, with a high temperature and the kid at school, I resolved to do something about it: first I fixed the startup problem, then I created an objecturi git branch to test caching of lowercase property keys as a mechanism to improve lookups.

Now it’s Friday and the flu is over, so I need to clean up and put the toys away.

Luckily, we are talking about free (as in freedom) software, so it’s easy to find friends who care about the toys and are happy to play together! So I had a chat with Ben resulting in a proper plan, and he agreed to go on playing some more and clean up afterwards.

Ain’t Free software wonderful ?

Feed the bats

April 22nd, 2010

My bats are gone with the spring, leaving behind what looks like delicious food!

These insects came out from who knows where and filled up the walls of our house.

They do look like Clothing Moths and probably are, but contrary to what Wikipedia says they don’t seem to prefer low light, as we often see them around light bulbs.

This invasion is undermining our will to remain in the house, which is a pity as there’s a wonderful garden out here and flowers are popping out on a daily basis :/

UNICODE from OSM to PGSQL (part 2)

April 5th, 2010

There is no problem importing OSM data into PostgreSQL / PostGIS.

In part one of the article we’ve seen Geofabrik’s shapefiles having a text data truncation problem, but using osm2pgsql everything gets into a UTF-8 database without a failure.

It’s as simple as:

$ osm2pgsql -l -c -S default.style africa.osm.bz2 -d osm

The -l switch asks for keeping lat/long projection, -c requests creation of the schema, -d specifies the database to use. The default.style file is a configuration specifying what to import and how; I used the default for the sake of this test.

Resulting relations:

 Schema |        Name        |   Type
--------+--------------------+----------
 public | planet_osm_line    | table
 public | planet_osm_point   | table
 public | planet_osm_polygon | table
 public | planet_osm_roads   | table

And the way we’ve been using for testing has all its characters:

full multibyte name of way 4005333

Let’s see it in Quantum GIS, compared with the one coming from the corrupted shapefile (which I’ve imported into postgis after hacking shp2pgsql to discard incomplete multibytes):

qgis screenshot

The difference you may notice seems to be due to left-to-right vs. right-to-left orientation of the text. My terminal seems to ignore orientation, qgis doesn’t.

Now, time to see if a shapefile will be able to bear all that UNICODE. Let’s not do anything fancy, just dump the roads table using pgsql2shp:

$ pgsql2shp osm planet_osm_roads

Pretty fast (slightly above 1 second system time, 8 secs real time). And here’s the generated shapefile dataset:

72979148 planet_osm_roads.shp
69555432 planet_osm_roads.dbf
  516268 planet_osm_roads.shx
     257 planet_osm_roads.prj

Do they have the full multibyte strings now ? Sure, shp2pgsql doesn’t complain anymore, and you can safely import into postgis again, completing the round-trip. You only have to specify UTF-8 as the input encoding because the new default encoding, as I pointed out in the previous post, is that unmentionable one… So:

$ shp2pgsql -W UTF-8 planet_osm_roads planet_osm_roads_roundtrip | psql osm
...
$ psql osm -c 'select name from planet_osm_roads_roundtrip where osm_id = 4005333';
 Avenue des Nations Unies - شارع الأمم المتحدة

Also, we can open the shapefile itself with qgis and see how it looks:

Green is the pgsql2shp-exported shapefile, red is osm2pgsql-imported planet osm, black is geofabrik-imported shapefile.

All clean and easy :)

Further exercises would include tweaking the osm2pgsql style file and generally the import process to better select data of interest, properly cleaning geometry invalidities and taking care of incremental updates of the data.

Good luck and happy hacking !

UNICODE from OSM to PGSQL

April 4th, 2010

This week I’ve been presented with a problem importing OpenStreetMap data of Africa from GeoFabrik’s shapefile export into a PostgreSQL / PostGIS database.

The problem consisted in a loss of information during the transport, resulting in wrongly encoded strings (road names) ending up in the db. This was during a feasibility study. So, is that feasible ? Let’s take a look.

I downloaded the shapefiles and tried to import the roads one using shp2pgsql with no options, and here’s the result:

Unable to convert field value "Place Othman Ibn Affane ساحة عثمان اِ" to UTF-8:
iconv reports "Invalid or incomplete multibyte or wide character"

Why is shp2pgsql trying to convert, and from which encoding? When I left it, the default was to perform no conversion unless -W was given…

Well, it turns out the default is now to convert from WINDOWS-1252 encoding (why?) and there is no way to request no conversion at all (why?!).

So I patched the loader to give more information about the encoding process and specified UTF-8 as source encoding. Here’s the result:

Unable to convert field value "Avenue des Nations Unies - شارع الأمم " from UTF-8 to UTF-8:
iconv reports "Invalid argument"

So it’s official: the dbf file contains invalid data. The confusing error message (Invalid argument) means the multibyte sequence is incomplete rather than invalid (EINVAL errno).

Adding more debugging code I can see that many many rows have values that look truncated, all ending with a single byte being either <D8> or <D9>.
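
Why a trailing <D8> or <D9> signals truncation: in UTF-8 every Arabic letter takes two bytes, and the first of the two (the lead byte) is in the range D8 to DB. A value that ends on such a byte has lost the second half of its last character. A minimal sketch of that check follows; the sample strings are cut-down stand-ins for the road name above, not data from the actual dbf file, and last_byte_hex is a helper name made up here.

```shell
# Print the last byte of a string as two lowercase hex digits.
last_byte_hex() {
  printf '%s' "$1" | tail -c 1 | od -An -tx1 | tr -d ' \n'
}

# \330\264 is the UTF-8 encoding (D8 B4) of the Arabic letter sheen.
# The first sample ends with a dangling D8 lead byte, the second is intact.
truncated=$(printf 'Avenue des Nations Unies - \330\264\330')
complete=$(printf 'Avenue des Nations Unies - \330\264')

hex_truncated=$(last_byte_hex "$truncated")
hex_complete=$(last_byte_hex "$complete")
echo "truncated value ends with: $hex_truncated"
echo "complete value ends with:  $hex_complete"
```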

OpenOffice confirms the malformation (I wanted an independent opinion on that, just in case it was shapelib doing the truncation):

OpenOffice shows the truncated multibyte value

Querying openstreetmap for way 4005333 shows the full string, and the full string is also present in the .osm file downloaded from geofabrik:

<tag k="name" v="Avenue des Nations Unies - شارع الأمم المتحدة"/>

So the problem is only with the shapefile, not the OSM data itself, nor with postgis.

Surely the postgis loader could be tweaked to allow for some tolerance, in case anyone wants to import the truncated data anyway. In this specific case discarding the final partial multibyte sequence might be the best you can do: it’s a case of truncation like any other, only with a multibyte encoding it causes more problems than with a single-byte one.
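
As an illustration of what such a tolerance could look like, here is a small sketch that drops a dangling lead byte from the end of a value. It is not the loader’s code nor the submitted patch: drop_partial_tail is a hypothetical helper, and it only handles the case seen here, a single lead byte (value 0xC0 or above) left at the very end.

```shell
# Remove a trailing UTF-8 lead byte (>= 0xC0, i.e. 192) left by truncation.
# Assumption: only one dangling byte at the end of the string is handled.
drop_partial_tail() {
  last=$(printf '%s' "$1" | tail -c 1 | od -An -tu1 | tr -d ' \n')
  if [ -n "$last" ] && [ "$last" -ge 192 ]; then
    len=$(printf '%s' "$1" | wc -c | tr -d ' ')
    printf '%s' "$1" | head -c "$((len - 1))"
  else
    printf '%s' "$1"
  fi
}

# Sample value ending with a dangling D8 byte, and the same value without it.
truncated=$(printf 'Avenue des Nations Unies - \330\264\330')
expected=$(printf 'Avenue des Nations Unies - \330\264')

cleaned=$(drop_partial_tail "$truncated")
untouched=$(drop_partial_tail "$expected")
```

The cleaned value is valid UTF-8 again, but the characters lost to truncation are of course still missing.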

Timely enough someone submitted a patch aimed at exactly this kind of tolerance handling. I’m going to see how well that’ll cope with this case.

But the bottom line is we do want the good data, so this problem is not solved until the data is in the database, whether passing through shapefiles (if possible) or directly.

Next stop: osm2pgsql -> go there

Meet the bats

February 14th, 2010

Batty flying in the sitting room

Since I moved into a new house I’ve got to know bats.

It could be due to the fact that the house was uninhabited for some time. Thing is, every now and then a bat or two comes out from who-knows-where and starts flying around the sitting room.

The first time we met was late at night. Diana woke me up scared. “What’s that thing flying around?!” It took a lot of time and cold before we succeeded at sending it out the window. I was surprised how little it wanted to get out. Wintertime. Very cold. We surely didn’t keep the windows open during the day. Where did that bat come from ?

We left the house for Christmas/New Year and when we came back two bats were flying in the sitting room. This time we were nicer. The kid was also happy to see them (she’s always wanted pets). This time they also sometimes stopped flying and took some rest against a wall. Finally they disappeared (again after the damn cold entered the house).

Then it happened again after leaving the house for just a few hours. This time I didn’t manage to let the bat out, but he disappeared anyway. God knows where. We tried to find him, looking in every corner of the house, with no luck. “Well, we’ll see tomorrow”. Went to sleep. The day after, while I was alone in the house, silently tapping on the computer keyboard, I heard a light noise of wings, then not much later the bat showed up again, flying in circles. This time I took the chance to take some pictures.

Flying bat

The bat didn’t fly away when I tried to pick him up with my hands (and a blanket) after he took a rest on the window curtain. So I took the chance to take more pictures, then took him out the window and he flew from my hand.

bat on the curtain
bat in the blanket

I start to suspect there’s actually a colony in the house which we’ll need to learn to cohabit with. Hopefully animal experts will have some advice for me. Anyone ?