I'll use a bit varying field with unlimited length.
To record unique visitors per day, I'll flip the bit corresponding to a user's id.
Each event type will have a separate record. Useful information can then be derived by taking the intersection (&) and union (|) of several fields.
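A sketch of what this implies (table, column, and event names are hypothetical; `overlay` is used to set a bit because it works on bit strings in all PostgreSQL versions):

```sql
-- Hypothetical schema: one row per (day, event type).
CREATE TABLE daily_events (
    day    date,
    event  text,
    bits   bit varying,   -- bit n set => user with id n was seen
    PRIMARY KEY (day, event)
);

-- Flip the bit for user id 42. overlay() positions are 1-based,
-- and the string must already be long enough to hold that position.
UPDATE daily_events
SET bits = overlay(bits placing B'1' from 42 + 1)
WHERE day = current_date AND event = 'login';

-- Users who triggered both events (&) or either event (|).
-- Note: & and | require both operands to have the same length.
SELECT a.bits & b.bits AS both_events,
       a.bits | b.bits AS either_event
FROM daily_events a
JOIN daily_events b ON a.day = b.day
WHERE a.event = 'login' AND b.event = 'purchase';
```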
A field for 1M users will require 1M bits = ~125KB.
Is it reasonable to expect that this will give better performance and a smaller space and memory footprint than a table with a unique index?
Is there a built-in function to count set bits, similar to BIT_COUNT in MySQL and BITCOUNT in Redis?
I've found several PL/pgSQL functions in the mailing list archives.
Which function performs best for large bit strings (>100KB)?
Thanks
CREATE OR REPLACE FUNCTION bitcount(i integer) RETURNS integer AS $$
DECLARE
    n integer;
    amount integer;
BEGIN
    amount := 0;
    -- an integer is 32 bits wide; the original version looped only over 16
    FOR n IN 1..32 LOOP
        amount := amount + ((i >> (n-1)) & 1);
    END LOOP;
    RETURN amount;
END
$$ LANGUAGE plpgsql;
create or replace function bitsetlen(bit) returns int as $$
declare
    i int;
    c int;
begin
    c := 0;
    for i in 1..length($1) loop
        if substring($1, i, 1) = B'1' then
            c := c + 1;
        end if;
    end loop;
    return c;
end;
$$ language plpgsql;
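Assuming both helper functions above are installed, a quick sanity check (the first takes an integer, the second a bit string):

```sql
SELECT bitcount(7);           -- low three bits set -> 3
SELECT bitsetlen(B'101101');  -- four set bits -> 4
```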
SELECT (myBit & 1) + ((myBit >> 1) & 1) + ((myBit >> 2) & 1) AS bitCount FROM myBitTable;
Note the parentheses: in PostgreSQL, + binds more tightly than & and >>, so the unparenthesized version groups incorrectly.
SELECT LENGTH( REPLACE( CAST( B'101000000000000000000010' AS TEXT ), '0', ''));
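For completeness: PostgreSQL 14 and later ship a built-in bit_count() for both bit and bytea values, which should outperform any PL/pgSQL loop on large bit strings:

```sql
SELECT bit_count(B'101000000000000000000010');  -- 3
```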