Ming the Mechanic
The NewsLog of Flemming Funch

Monday, August 16, 2004

 Working
I'm not getting much posted here right now. I'm in a streak where I get a lot of programming done. Working on my OrgSpace system. I've integrated a bunch of my modules a good deal better, into one unified system. And now I'm reviving a content management system I did some years ago. It will be pretty cool when I'm done. It isn't great for my sanity to work in this kind of mode, where I just work on the same thing all day, but I do get things done. And right now the rest of my family is out of town, so I've got peace and quiet for once. Well, I'll follow them to Denmark in a few days, but I'll make the most of it while they're away.
[ 2004-08-16 14:33 ]

 Standard Data Sources
I occasionally have the problem of trying to figure out which is the most authoritative source for some type of data. That usually isn't easy, and there isn't much automation available.

So, for example, I'm adding a list of languages to a program. I'd like it to be a standard list, using standard codes. OK, I quickly find out that there is an international standard for that, ISO 639, which provides two- or three-letter codes. And the authoritative site on it has a list, in an HTML table. Which, after half an hour of work, I got imported into a database. It was obviously written by a human, with the cells in a bunch of inconsistent formats. But why isn't this in a consistent XML format I can pick up automatically? What if the list changes, like when they decide next week that there are really a couple more languages that need to be on it? It is doubtful I'll ever get around to importing it again the hard way, unless somebody has a problem. So sooner or later my data will be out of whack.
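For what it's worth, that kind of import can be sketched in a few lines of Python. This is only an illustration under assumptions: the sample table is made up to mimic the sort of inconsistent cells described above and is not the real ISO 639 page, and the names TableRows and parse_language_table are hypothetical.

```python
from html.parser import HTMLParser

# Made-up sample in the spirit of a hand-written table: stray spaces,
# upper case, inline markup, an &nbsp; for "no code", a wrapped cell.
SAMPLE = """
<table>
<tr><th>639-2</th><th>639-1</th><th>Name</th></tr>
<tr><td> DAN </td><td>da</td><td>Danish</td></tr>
<tr><td>fre/fra</td><td><b>fr</b></td><td>French</td></tr>
<tr><td>ang</td><td>&nbsp;</td><td>English,
   Old (ca. 450-1100)</td></tr>
</table>
"""


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse line breaks, &nbsp; and repeated spaces inside the cell.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_language_table(html):
    """Return one normalized dict per data row of a 3-column language table."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    languages = []
    for row in parser.rows[1:]:          # skip the header row
        if len(row) != 3:
            continue                     # ignore rows that don't fit the layout
        alpha3, alpha2, name = row
        languages.append({
            "alpha3": [code.lower() for code in alpha3.split("/")],
            "alpha2": alpha2.lower() or None,
            "name": name,
        })
    return languages


for language in parse_language_table(SAMPLE):
    print(language)
```

The normalizing is the easy part. The fragile part is that any change to the page layout breaks the parser, which is exactly why a consistent machine-readable format from the source would be better.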

And then I need a thing for selecting time zones, so I can show times in people's local time. Where's the authority for that? I can find lots of places that list the different time zones, but no easy way of knowing when each of them is on daylight saving time. The map of who uses what system across the world is surprisingly complex. Just see Canada. But the whole thing would really be a few kilobytes of data. I just want the correct data. I can find companies selling that, for $399 per year, but that's kind of silly. ... Ah, a little more research shows that the Olson TZ database built into pretty much all Unix and Linux systems is a fine solution. It isn't authoritative, but it seems to be good, it gets updated once in a while, and it is already there. I kind of knew that, I had just forgotten. I'll use that. But, really, there should be one authoritative web service somewhere I could just call. Manned by one employee in the UN or something, who'll call somebody in each region a couple of times per year to ask whether they've changed their system, and who updates the database accordingly.
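As a rough sketch of what using the Olson database looks like from a program, here is a Python version. It assumes Python 3.9 or later, where the standard zoneinfo module reads the tz database installed on the system (on a machine without one, the separate tzdata package would be needed). The helper name local_time and the example timestamp are just illustrations.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+, reads the system's tz database


def local_time(utc_moment, zone_name):
    """Convert an aware UTC datetime to wall-clock time in the named zone."""
    return utc_moment.astimezone(ZoneInfo(zone_name))


# An illustrative moment, stored in UTC.
posted = datetime(2004, 8, 16, 12, 33, tzinfo=timezone.utc)

# Zone names are Olson identifiers of the form Area/Location.
for zone in ("Europe/Copenhagen", "America/Toronto", "America/Regina"):
    local = local_time(posted, zone)
    # %Z is the zone abbreviation; dst() is the daylight saving offset in effect.
    print(zone, local.strftime("%Y-%m-%d %H:%M %Z"), "DST offset:", local.dst())
```

The two Canadian zones show why a simple table of offsets isn't enough: Toronto is on daylight saving time in August, while Regina (Saskatchewan) stays on standard time all year, and the database knows that without any extra rules in the program.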

There are a lot of things one could do if more data were easily available as web services, in authoritative, normalized versions. Population, environmental, geographical, financial data. If it were all available in standard ways, I could make my own analysis of what seems to be going on in the world. As it is right now, one has to put up with questionable third-hand data, and it takes quite some financing to get somebody to normalize the data so one can do things with it.
[ 2004-08-16 15:32 ]

Main Page: ming.tv