How to create user preference vectors from clickstreams when the preferences may follow multimodal distributions

#1
Hello all. This is my first post here. Thank you in advance for your help.

Let's imagine we have products that are each described by 40 conceptual features, each scored between 0 and 1.

Let's imagine the first feature is 'color'. 0 would be white, 1 would be black, just for the sake of the example.

Every user interaction (like, follow, time spent on page) is recorded with a weight that reflects how meaningful the action is (a purchase counts for more than a like, for instance).

When a new product arrives on the platform, I want to match it against the users, so that I can suggest it to the people most likely to be interested in it.

We plan to compute a cosine similarity score between a) the product vector and b) the user's preference vector.
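To make the scoring step concrete, here is a minimal sketch of the cosine similarity we have in mind. The three-feature vectors and the choice to return 0 for an all-zero vector are assumptions for illustration only; the real vectors would have 40 features.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumed convention: an all-zero vector is similar to nothing.
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical 3-feature vectors (the real ones would have 40 features).
product = [0.9, 0.2, 0.5]
user_pref = [0.8, 0.1, 0.6]
score = cosine_similarity(product, user_pref)
print(f"similarity = {score:.3f}")
```

One thing I notice with this setup: because every score is between 0 and 1, the similarity is also between 0 and 1, and a feature value of 0 (white, in my example) contributes nothing to the dot product.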

This is where we struggle: how do we get a coherent b), the user's preference vector?

Indeed, let's imagine that I like both white products (score = 0.00) and black products (score = 1.00) a lot. I don't want to use an average per feature, because liking white and black doesn't mean my favourite colour is grey.
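Here is a small sketch of the problem on the 'color' feature alone. The interaction log and the action weights (buy = 3.0, like = 1.0) are made up for illustration.

```python
# Hypothetical interaction log for one user on the 'color' feature:
# (feature score of the product, weight of the action).
# The weights are invented for this example: buy = 3.0, like = 1.0.
interactions = [
    (0.00, 3.0),  # bought a white product
    (0.05, 1.0),  # liked a near-white product
    (1.00, 3.0),  # bought a black product
    (0.95, 1.0),  # liked a near-black product
]

total_weight = sum(w for _, w in interactions)
weighted_mean = sum(s * w for s, w in interactions) / total_weight

# Distance from the mean to the closest product the user actually touched.
nearest_gap = min(abs(s - weighted_mean) for s, _ in interactions)

print(f"weighted mean = {weighted_mean:.2f}")
print(f"gap to nearest real interaction = {nearest_gap:.2f}")
```

The weighted mean lands at about 0.5 (grey), which is far from every product this user actually interacted with.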

So my question: how can I best represent a user with vectors? Maybe not just one, but a few? And how can I tell whether a user's distribution is unimodal or multimodal?
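One rough idea I have considered, sketched here for a single feature: find the best split of a user's weighted scores into two groups and see how much of the spread that split explains. The function name, the sample data, and the 0.1 cut-off are all assumptions on my side, not an established test. I understand that fitting mixture models with one and two components and comparing them (for example by BIC) would be a more standard version of the same idea.

```python
def best_two_way_split(values, weights):
    """Best 1-D split into two groups by weighted squared error.

    Returns (ratio, low_center, high_center). The ratio is the
    within-group error after the split divided by the error around the
    single overall mean; a ratio near 0 means two tight, well-separated
    groups. Weights are assumed to be positive.
    """
    pairs = sorted(zip(values, weights))
    if not pairs:
        raise ValueError("need at least one interaction")

    def center_and_error(group):
        total_w = sum(w for _, w in group)
        center = sum(v * w for v, w in group) / total_w
        error = sum(w * (v - center) ** 2 for v, w in group)
        return center, error

    _, total_error = center_and_error(pairs)
    if total_error == 0.0:
        only = pairs[0][0]
        return 1.0, only, only  # all values identical: a single mode

    best = None
    for cut in range(1, len(pairs)):
        low_c, low_e = center_and_error(pairs[:cut])
        high_c, high_e = center_and_error(pairs[cut:])
        ratio = (low_e + high_e) / total_error
        if best is None or ratio < best[0]:
            best = (ratio, low_c, high_c)
    return best


# Hypothetical 'color' scores for two users, all actions weighted equally.
two_tastes = [0.0, 0.05, 0.1, 0.9, 0.95, 1.0]
one_taste = [0.4, 0.45, 0.5, 0.5, 0.55, 0.6]
equal_weights = [1.0] * 6

THRESHOLD = 0.1  # arbitrary cut-off chosen for this illustration
for name, scores in [("two_tastes", two_tastes), ("one_taste", one_taste)]:
    ratio, low, high = best_two_way_split(scores, equal_weights)
    label = "looks bimodal" if ratio < THRESHOLD else "looks unimodal"
    print(f"{name}: ratio={ratio:.3f}, centers=({low:.2f}, {high:.2f}) -> {label}")
```

When the ratio is small, the two centers could serve as two separate preference values for that user, and a new product could be scored against whichever one it is closest to. On the full 40-feature vectors, the analogous step would be a clustering such as k-means over the user's interacted products.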

And for which features is a specific user's distribution significantly different from that of the other users?
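As a sketch of what I mean by "different", one option would be to compare, feature by feature, the user's scores against everyone else's with the two-sample Kolmogorov-Smirnov statistic, which looks at the whole shape of the distribution rather than just the mean. The feature names and data below are hypothetical, and this version ignores the action weights.

```python
import bisect


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    This is the largest gap between the two empirical cumulative
    distribution functions: 0 means identical, 1 means fully separated.
    """
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(sample_a), sorted(sample_b)
    largest_gap = 0.0
    for x in sorted(set(a) | set(b)):
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Hypothetical per-feature scores: one user versus all other users.
user_scores = {
    "color": [0.0, 0.05, 0.95, 1.0],
    "price": [0.4, 0.5, 0.5, 0.6],
}
other_scores = {
    "color": [0.4, 0.45, 0.5, 0.5, 0.55, 0.6],
    "price": [0.4, 0.45, 0.5, 0.5, 0.55, 0.6],
}

# Rank features from most to least distinctive for this user.
ranking = sorted(
    ((ks_statistic(user_scores[f], other_scores[f]), f) for f in user_scores),
    reverse=True,
)
for stat, feature in ranking:
    print(f"{feature}: KS statistic = {stat:.3f}")
```

The statistic alone only ranks the features. To call a difference significant I would still need a p-value (SciPy's `scipy.stats.ks_2samp` returns one) and probably a correction for testing 40 features at once, and with only a handful of interactions per user the result will be noisy.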

Many questions, for which I have found little literature to read. This is why I am turning to this forum.

Thank you
Daniel