Projects STRLCPY dolt Files
🤬
main
ROOT /
go /
store /
datas /
pull /
pulling.md
93 lines | ISO-8859-1 | 4 KB

Dataset pulling algorithm

The approach is to explore the chunk graph of both sink and source in order of decreasing ref-height. As the code walks, it uses the knowledge gained about which chunks are present in the sink to both prune the source-graph-walk and build up a set of hints that can be sent to a remote Database to aid in chunk validation.

Basic algorithm

  • let sink be the sink database

  • let source be the source database

  • let snkQ and srcQ be priority queues of Ref prioritized by highest Ref.height

  • let hints be a map of hash => hash

  • let reachableChunks be a set of hashes

  • let snkHdRef be the ref (of Commit) of the head of the sink dataset

  • let srcHdRef be the ref of the source Commit, which must descend from the Commit indicated by snkHdRef

  • let traverseSource(srcRef, srcQ, sink, source, reachableChunks) be

    • pop srcRef from srcQ
      • if !sink.has(srcRef)
        • let c = source.batchStore().Get(srcRef.targetHash)
        • let v = types.DecodeValue(c, source)
        • insert all child refs, cr, from v into srcQ and into reachableRefs
        • sink.batchStore().Put(c, srcRef.height, no hints)
          • (hints will all be gathered and handed to sink.batchStore at the end)
  • let traverseSink(sinkRef, snkQ, sink, hints) be

    • pop snkRef from snkQ
    • if snkRef.height > 1
      • let v = sink.readValue(snkRef.targetHash)
      • insert all child refs, cr, from v into snkQ and hints[cr] = snkRef
  • let traverseCommon(comRef, snkHdRef, snkQ, srcQ, sink, hints) be

    • pop comRef from both snkQ and srcQ
    • if comRef.height > 1
      • if comRef is a Ref of Commit
        • let v = sink.readValue(comRef.targetHash)
        • if comRef == snkHdRef
          • ignore all parent refs
          • insert each other child ref cr from v into snkQ only, set hints[cr] = comRef
        • else
          • insert each child ref cr from v into both snkQ and srcQ, set hints[cr] = comRef
  • let `pull(source, sink, srcHdRef, sinkHdRef)

    • insert snkHdRef into snkQ and srcHdRef into srcQ
    • create empty hints and reachableChunks
    • while srcQ is non-empty
      • let srcHt and snkHt be the respective heights of the top Ref in each of srcQ and snkQ
      • if srcHt > snkHt, for every srcHdRef in srcQ which is of greater height than snkHt
        • traverseSource(srcHdRef, srcQ, sink, source)
      • else if snkHt > srcHt, for every snkHdRef in snkQ which is of greater height than srcHt
        • traverseSink(snkHdRef, snkQ, sink)
      • else
        • for every comRef in which is common to snkQ and srcQ which is of height srcHt (and snkHt)
          • traverseCommon(comRef, snkHdRef, snkQ, srcQ, sink, hints)
        • for every ref in srcQ which is of height srcHt
          • traverseSource(ref, srcQ, sink, source, reachableChunks)
        • for every ref in snkQ which is of height snkHt
          • traverseSink(ref, snkQ, sink, hints)
    • for all hash in reachableChunks
      • sink.batchStore().addHint(hints[hash])

Isomorphic, but less clear, algorithm

  • let all identifiers be as above

  • let traverseSource, traverseSink, and traverseCommon be as above

  • let higherThan(refA, refB) be

    • if refA.height == refB.height
      • return refA.targetHash < refB.targetHash
    • return refA.height > refB.height
  • let `pull(source, sink, srcHdRef, sinkHdRef)

    • insert snkHdRef into snkQ and srcHdRef into srcQ
    • create empty hints and reachableChunks
    • while srcQ is non-empty
      • if sinkQ is empty
        • pop ref from srcQ
        • `traverseSource(ref, srcQ, sink, source, reachableChunks))
      • else if higherThan(head of srcQ, head of snkQ)
        • pop ref from srcQ
        • `traverseSource(ref, srcQ, sink, source, reachableChunks))
      • else if higherThan(head of snkQ, head of srcQ)
        • pop ref from snkQ
        • traverseSink(ref, snkQ, sink, hints)
      • else, heads of both queues are the same
        • pop comRef from snkQ and srcQ
        • traverseCommon(comRef, snkHdRef, snkQ, srcQ, sink, hints)
    • for all hash in reachableChunks
      • sink.batchStore().addHint(hints[hash])
Please wait...
Page is in error, reload to recover