Latest Tech News

Stay updated with the latest in technology, AI, cybersecurity, and more


Rendezvous Hashing Explained (2020)

Rendezvous hashing is an algorithm to solve the distributed hash table problem - a common and general pattern in distributed systems. There are three parts of the problem: keys (unique identifiers for data or workloads), values (data or workloads that consume resources), and servers (entities that manage data or workloads). For example, in a distributed storage system, the key might be a filename, the value is the file data, and the servers are networked data servers that collectively store all of the f
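The excerpt stops before describing the algorithm itself. As a rough sketch of the usual highest-random-weight formulation: every server gets a pseudo-random score for a given key, and the key goes to the server with the highest score. The server names, the `score` and `pick_server` helpers, and the choice of SHA-256 are assumptions made for illustration, not details taken from the article.

```python
import hashlib


def score(server: str, key: str) -> int:
    """Pseudo-random weight for a (server, key) pair (SHA-256 is an arbitrary choice)."""
    digest = hashlib.sha256(f"{server}\x00{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def pick_server(key: str, servers: list[str]) -> str:
    """Return the server with the highest weight for this key."""
    return max(servers, key=lambda s: score(s, key))


# Hypothetical server names, for illustration only.
servers = ["server-a", "server-b", "server-c"]
owner = pick_server("report.pdf", servers)
```

Because each score depends only on one server and the key, removing a server that does not own a key leaves that key's owner unchanged; only the keys owned by the removed server move.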

Optimizing ClickHouse for Intel's 280 core processors

This is a guest post from Jiebin Sun, Zhiguo Zhou, Wangyang Guo and Tianyou Li, performance optimization engineers at Intel Shanghai. Intel's latest processor generations are pushing the number of cores in a server to unprecedented levels - from 128 P-cores per socket in Granite Rapids to 288 E-cores per socket in Sierra Forest, with future roadmaps targeting 200+ cores per socket. These numbers multiply on multi-socket systems; such servers may consist of 400 or more cores. The paradigm of "m

UUIDv47: Store UUIDv7 in DB, emit UUIDv4 outside (SipHash-masked timestamp)

uuidv47 lets you store sortable UUIDv7 in your database while emitting a UUIDv4-looking façade at your API boundary. It does this by XOR-masking only the UUIDv7 timestamp field with a keyed SipHash-2-4 stream tied to the UUID’s own random bits. Features: header-only C (C89), zero deps; deterministic, invertible mapping (exact round-trip); RFC-compatible version/variant bits (v7 in DB, v4 on the wire); key-recovery resistant (SipHash-2-4, 128-b
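The masking idea can be sketched in a few lines. This is not the uuidv47 library or its exact bit layout: the standard library exposes no keyed SipHash-2-4, so keyed BLAKE2b from `hashlib` stands in for it here, and `make_uuid7`, `encode_facade`, `decode_facade`, and the sample key are hypothetical names and values used only for illustration. The point shown is that a mask derived from the untouched random bits can be recomputed on the way back, which makes the mapping invertible.

```python
import hashlib
import os
import time
import uuid

TS_SHIFT = 80                 # the 48-bit timestamp sits in the top bits
TS_MASK = (1 << 48) - 1
LOW_76 = (1 << 76) - 1        # everything below the 4-bit version field


def make_uuid7() -> uuid.UUID:
    """Hand-built UUIDv7: 48-bit ms timestamp, version 7, RFC variant, random rest."""
    ts = int(time.time() * 1000) & TS_MASK
    rand = int.from_bytes(os.urandom(10), "big") & LOW_76
    rand = (rand & ~(0x3 << 62)) | (0x2 << 62)   # variant bits "10"
    return uuid.UUID(int=(ts << TS_SHIFT) | (0x7 << 76) | rand)


def _mask(low_bits: int, key: bytes) -> int:
    """48-bit keyed mask derived only from bits that the mapping never changes.

    Assumption: keyed BLAKE2b is a stand-in for the SipHash-2-4 stream.
    """
    data = low_bits.to_bytes(16, "big")
    return int.from_bytes(hashlib.blake2b(data, key=key, digest_size=6).digest(), "big")


def _swap(u: uuid.UUID, key: bytes, version: int) -> uuid.UUID:
    low = u.int & LOW_76
    ts = (u.int >> TS_SHIFT) ^ _mask(low, key)   # XOR is its own inverse
    return uuid.UUID(int=(ts << TS_SHIFT) | (version << 76) | low)


def encode_facade(u7: uuid.UUID, key: bytes) -> uuid.UUID:
    """UUIDv7 (database) -> UUIDv4-looking value (API boundary)."""
    return _swap(u7, key, 4)


def decode_facade(u4: uuid.UUID, key: bytes) -> uuid.UUID:
    """UUIDv4-looking value -> original UUIDv7."""
    return _swap(u4, key, 7)


key = b"0123456789abcdef"     # hypothetical 128-bit key
stored = make_uuid7()
public = encode_facade(stored, key)
```

Decoding `public` with the same key returns `stored` exactly, while a different key yields a different timestamp field.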

Hashed sorting is typically faster than hash tables

Problem statement: count the unique values in a large array of mostly-unique uint64s. Two standard approaches are: (1) insert into a hash table and return the number of entries; (2) sort the array, then count positions that differ from their predecessor. Hash tables win the interview (O(n) vs O(n log n)), but sorting is typically faster in a well-tuned implementation. This problem and its variants are the inn
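For reference, the two standard approaches look roughly like this. The function names are made up for illustration, and Python's built-in `set` and `sorted` will not reproduce the article's performance comparison, which concerns well-tuned native implementations; the sketch only shows that both approaches compute the same count.

```python
def count_unique_hash(values):
    """Approach 1: insert into a hash table and return the number of entries."""
    return len(set(values))


def count_unique_sorted(values):
    """Approach 2: sort, then count positions that differ from their predecessor."""
    ordered = sorted(values)
    if not ordered:
        return 0
    # The first element always counts; each later one counts only if it changed.
    return 1 + sum(1 for prev, cur in zip(ordered, ordered[1:]) if cur != prev)


data = [5, 3, 5, 9, 3, 1]
unique_by_hash = count_unique_hash(data)
unique_by_sort = count_unique_sorted(data)
```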


IRHash: Efficient Multi-Language Compiler Caching by IR-Level Hashing

Compilation caches (CCs) save time, energy, and money by avoiding redundant compilations. They are provided by means of compiler wrappers (Ccache, sccache, cHash) or native build system features (Bazel, Buck2). Conceptually, a CC pays off if the achieved savings by cache hits outweigh the extra costs for cache lookups. Thus, most techniques try to detect a cache hit early in the compilation process by hashing the (preprocessed/tokenized) source code, but hashing the AST has also been suggested t

iPhone 17 announcement imminent as alleged Apple Event ‘hashmoji’ surfaces

Update: 9to5Mac has confirmed this information. Apple is expected to officially announce its September iPhone 17 event as soon as today. An alleged “hashmoji” on X has appeared, building on that expectation. The #AppleEvent hashmoji, according to an X account dedicated to finding these, shows an Apple logo with what appears to be a thermal imaging view inside. The account further claims that the hashmoji will go live at 9 am PT/12pm ET. This might be our first look at the event theme, assumin

P-fast trie, but smaller

Previously, I wrote some sketchy ideas for what I call a p-fast trie, which is basically a wide fan-out variant of an x-fast trie. It allows you to find the longest matching prefix or nearest predecessor or successor of a query string in a set of names in O(log k) time, where k is the key length. My initial sketch was more complicated and greedy for space than necessary, so here’s a simplified revision. (“p” now stands for prefix.) A p-fast trie stores a lexicographically ordered set of names

How to prove false statements: Practical attacks on Fiat-Shamir

Randomness is a source of power. From the coin toss that decides which team gets the ball to the random keys that secure online interactions, randomness lets us make choices that are fair and impossible to predict. But in many computing applications, suitable randomness can be hard to generate. So instead, programmers often rely on things called hash functions, which swirl data around and extract some small portion in a way that looks random. For decades, many computer scientists have presumed

Computer Scientists Figure Out How to Prove Lies

Randomness is a source of power. From the coin toss that decides which team gets the ball to the random keys that secure online interactions, randomness lets us make choices that are fair and impossible to predict. But in many computing applications, suitable randomness can be hard to generate. So instead, programmers often rely on things called hash functions, which swirl data around and extract some small portion in a way that looks random. For decades, many computer scientists have presumed

Parallelizing SHA256 Calculation on FPGA

A few weeks ago, I wrote an article where I developed a hash calculator on an FPGA. Specifically, I implemented an SHA-256 calculator. This module computes the hash of a string (up to 25 bytes) in 68 clock cycles. The design leverages the parallelism of FPGAs to compute the W matrix and the recursive rounds concurrently. However, it produces only one hash every 68 clock cycles, leaving most of the FPGA underutilized during that time. In this article we are going to elevate the performance of t


Bloom Filters by Example

A Bloom filter is a data structure designed to tell you, rapidly and memory-efficiently, whether an element is present in a set. The price paid for this efficiency is that a Bloom filter is a probabilistic data structure: it tells us that the element either definitely is not in the set or may be in the set. The base data structure of a Bloom filter is a Bit Vector. Here's a small one we'll use to demonstrate: Each empty cell in that table represents a bit, and the nu
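A minimal sketch of the structure described above, assuming a 64-bit vector and three hash positions per item derived from salted SHA-256. These sizes, the hashing scheme, and the `BloomFilter` class itself are illustrative choices, not the ones used in the article.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits: int = 64, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits   # the bit vector, all zeros at first

    def _positions(self, item: str):
        """Yield one bit position per hash function (salted SHA-256 here)."""
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item: str) -> bool:
        """False means definitely absent; True means possibly present."""
        return all(self.bits[pos] for pos in self._positions(item))


bloom = BloomFilter()
bloom.add("apple")
```

An added item always tests as possibly present, because its bits are never cleared. An item that was never added usually tests as absent, but may collide with bits set by other items, which is the source of false positives.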

The probability of a hash collision (2022)

A hash function takes arbitrarily complex input - a word, a website, an image, a human being - and maps it to a single number. This is useful for various computer science stuffs, such as data storage and cryptography. For example, let's say you want to store a book in one of N boxes. If you put the book in a random box, it's quite likely that you'll forget which box you picked, especially as N gets bigger. What you can do instead i
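The excerpt ends before any formula appears. Under the standard assumption that each of k items lands in one of N boxes uniformly and independently (the birthday-problem model, which may differ in detail from the article's treatment), the chance of at least one collision can be computed as below. The function names are hypothetical.

```python
import math


def collision_probability(k: int, n_boxes: int) -> float:
    """Exact chance that at least two of k uniformly hashed items share a box."""
    if k > n_boxes:
        return 1.0                      # pigeonhole: a collision is certain
    p_all_distinct = 1.0
    for i in range(k):
        p_all_distinct *= (n_boxes - i) / n_boxes
    return 1.0 - p_all_distinct


def collision_probability_approx(k: int, n_boxes: int) -> float:
    """Common approximation 1 - exp(-k(k-1) / (2N)), good when k is much smaller than N."""
    return 1.0 - math.exp(-k * (k - 1) / (2 * n_boxes))


# Classic birthday case: 23 items, 365 boxes.
p_exact = collision_probability(23, 365)
p_approx = collision_probability_approx(23, 365)
```

With 23 items and 365 boxes the exact probability is just over one half, which is the familiar birthday paradox figure.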
