Next Web: Web 3.0, the semantic web and the future of the internet > scale




    Published on 24.6.2013 by Equipo GNOSS

    An example of putting museum datasets to use.

    Abstract: "The large dataset made available by the Museum of Cultural History, University of Oslo, is used 
    to present broad patterns in the geographic distribution of all Stone Age finds from the museum. A set of 
    metadata is introduced to describe the precision and accuracy of the geographic information. To incorporate 
    most of the finds, the first presentation is done at the level of municipality. In a second analysis, only finds 
    with more precise location are used, and Mesolithic/Early Neolithic and Late Neolithic sites are separated. 
    This shows that a change in the distribution pattern from these large museum databases can be good starting points for analyses, a place to get new ideas and to see whether a hypothesis might be worth pursuing."





    Published on 24.6.2011 by Equipo GNOSS

    Urs Hölzle of Google: "When you scale, everything can break"

    An interesting interview with Urs Hölzle, Google's first vice president of engineering, in which he gives a realistic view of the problems and risks of a large-scale architecture such as Google's.

    Beyond the headline, "At scale, everything breaks", Hölzle offers the following insights:

    "In fact, we're phasing out GFS in favour of the next-generation file system that is very similar, but it's not GFS anymore. It scales better and has better latency properties as well. I think three years from now we'll try to retire that because flash memory is coming and faster networks and faster CPUs are on the way and that will change how we want to do things."

    "For instance, cluster management itself or some open-source version will happen, because everyone needs it as their computation scales and their issue becomes not the management of a single machine, but the management of a whole bunch of them."

    "I think the big challenges haven't changed that much. I'd say that it's dealing with failure, because at scale everything breaks no matter what you do and you have to deal reasonably cleanly with that and try to hide it from the people actually using your system."
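One common way to "hide" failures from users, as the quote above puts it, is to fail over between redundant replicas. The following is only a minimal sketch under stated assumptions: the interview does not describe Google's mechanism, and every name here (`read_with_failover`, `broken_replica`, `healthy_replica`) is hypothetical.

```python
from typing import Callable, Sequence


class AllReplicasFailed(Exception):
    """Raised only when every replica has failed."""


def read_with_failover(replicas: Sequence[Callable[[str], bytes]], key: str) -> bytes:
    """Try each replica in order and return the first successful read."""
    errors = []
    for fetch in replicas:
        try:
            return fetch(key)
        except Exception as exc:  # one broken replica should not break the caller
            errors.append(exc)
    raise AllReplicasFailed(f"{len(errors)} replica(s) failed for key {key!r}")


# Hypothetical replicas for illustration: one always fails, one works.
def broken_replica(key: str) -> bytes:
    raise ConnectionError("simulated hardware failure")


def healthy_replica(key: str) -> bytes:
    return b"value-for-" + key.encode()


if __name__ == "__main__":
    # The caller gets a result and never sees the first replica's failure.
    print(read_with_failover([broken_replica, healthy_replica], "user42"))
```

The caller only sees an error when every replica is down, which is the "reasonably clean" degradation the quote asks for.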

    "We use tapes, still, in this age because they're actually a very cost-effective way as a last resort for Gmail. The reason why we put it in is not physical data loss, but once in a blue moon you will have a bug that destroys all copies of the online data and your only protection is to have something that is not connected to the same software system, so you can go and redo it."

    "Automation is key, but it's also dangerous. You can shut down all machines automatically if you have a bug."
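The risk described above, a bug that shuts down every machine at once, is often mitigated by capping how much of the fleet a single automated action may touch. This is a minimal sketch of that idea, not something described in the interview; the function name `plan_shutdown`, the `max_fraction` parameter and the 10% default are illustrative assumptions.

```python
from typing import Set


class BlastRadiusExceeded(Exception):
    """Raised when an automated action targets too much of the fleet."""


def plan_shutdown(requested: Set[str], fleet: Set[str], max_fraction: float = 0.1) -> Set[str]:
    """Return the machines to shut down, refusing suspiciously large requests.

    max_fraction is a hypothetical safety limit: the largest share of the
    fleet that one automated run may shut down without human review.
    """
    targets = requested & fleet  # ignore machines that are not in the fleet
    limit = int(len(fleet) * max_fraction)
    if len(targets) > limit:
        raise BlastRadiusExceeded(
            f"refusing to shut down {len(targets)} machines (limit is {limit})"
        )
    return targets


if __name__ == "__main__":
    fleet = {f"m{i}" for i in range(100)}
    print(sorted(plan_shutdown({"m1", "m2"}, fleet)))  # small request is allowed
    try:
        plan_shutdown(fleet, fleet)  # a buggy "shut down everything" request
    except BlastRadiusExceeded as exc:
        print("blocked:", exc)
```

A guard like this does not fix the bug itself; it only limits how far a faulty automated decision can spread before a person looks at it.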

    "Keeping things simple and yet scalable is actually the biggest challenge. It's really, really hard. Most things don't work that well at scale, so you need to introduce some complexity, but you have to keep it down."