Friday, March 29, 2024

Summary of Outlier's tutorial: Cross Attention

Self-attention: we'd like to find a weighted sum of feature vectors for a sequence of inputs, whether it is a paragraph of text or a list of image segments. The feature vector for each input vector is produced by the Value matrix. The weight is computed by multiplying the Query vector (for the vector in focus) with the Key vector of each vector in the input.
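A minimal numpy sketch of this weighted-sum view (the dimensions and random weights here are made up for illustration):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """X: (n, d) input vectors; Wq/Wk/Wv: (d, d_k) projection matrices."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv                  # queries, keys, values
    weights = softmax(Q @ K.T / np.sqrt(K.shape[1]))  # (n, n) attention weights
    return weights @ V                                # weighted sum of value vectors

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))    # 4 tokens, 8-dim features
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8): one attended vector per input token
```

Each row of `weights` sums to 1, so each output row really is a weighted average of the value vectors.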

Cross-attention: now consider having two inputs: one image and one text. The Query matrix applies to the image, while the Key and Value matrices apply to the text. Find the attention between each image fragment and each word. Taking these values as weights, apply them to the Value vectors to create the weighted sum. This result can then be multiplied by another matrix to transform it back to the dimension of the input image, to construct a (different) image.
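The same sketch adapted to two inputs (patch counts, word counts, and dimensions are again made up): queries come from the image, keys and values from the text, and a final projection maps back to the image dimension.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(img, txt, Wq, Wk, Wv, Wo):
    """img: (n_patches, d_img); txt: (n_words, d_txt).
    Queries from the image; keys and values from the text."""
    Q = img @ Wq                  # (n_patches, d_k)
    K, V = txt @ Wk, txt @ Wv     # (n_words, d_k) each
    weights = softmax(Q @ K.T / np.sqrt(K.shape[1]))  # patch-to-word attention
    fused = weights @ V           # weighted sum of text value vectors
    return fused @ Wo             # project back to the image dimension

rng = np.random.default_rng(0)
img = rng.normal(size=(6, 16))    # 6 image patches, 16-dim
txt = rng.normal(size=(5, 12))    # 5 words, 12-dim
Wq = rng.normal(size=(16, 8)); Wk = rng.normal(size=(12, 8))
Wv = rng.normal(size=(12, 8)); Wo = rng.normal(size=(8, 16))
out = cross_attention(img, txt, Wq, Wk, Wv, Wo)
print(out.shape)  # (6, 16) — same shape as the image input
```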


 Reference: https://www.youtube.com/watch?v=aw3H-wPuRcw

Thursday, March 28, 2024

Summary of Shusen Wang's tutorial: Item-to-Item (I2I) model

The most common usage is User-Item-Item: the user has a list of favorite items. Use those favorite items to find similar items (a.k.a. Item-to-Item), and recommend the similar items to the user.

How to measure similarity: if many users like both items, then the two items are similar. (This is the same idea as Item Collaborative Filtering.)

Another way to measure similarity: compare items' feature vectors.

User-User-Item: User 1 is similar to User 2. User 2 likes item A. So recommend item A to User 1.

User-Author-Item: User 1 likes Author 2, and Author 2 has item A. So recommend item A to User 1.

User-Author-Author-Item: User 1 likes Author 2. Author 2 and Author 3 are similar. So recommend Author 3's items to User 1.

Using multiple models, each with its own share of the retrieval quota, achieves better results than using just one model to retrieve all items.
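The feature-vector similarity measure above can be sketched as a cosine similarity over item vectors (the items and their vectors here are hypothetical):

```python
import numpy as np

def cosine_sim(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical item feature vectors
items = {
    "A": np.array([1.0, 0.0, 1.0]),
    "B": np.array([0.9, 0.1, 1.0]),
    "C": np.array([0.0, 1.0, 0.0]),
}

# I2I: for one of the user's favorite items, rank the other items by similarity
favorite = "A"
ranked = sorted(((cosine_sim(items[favorite], v), k)
                 for k, v in items.items() if k != favorite), reverse=True)
print(ranked[0][1])  # "B" — the item most similar to A
```

In practice the similar items for each favorite would be retrieved from an index rather than by scanning all items.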


Summary of Shusen Wang's tutorial: Boost New Items

New items cannot compete with older items. Thus it is necessary to give new items a boost in search when they are created, to guarantee some impressions.

It is difficult to assign the boost score, and it often leads to giving too many impressions to new items. Thus limiting impressions is also necessary.

One way is to vary the boost score according to the impressions that have already been served. (Ex: create a table that reduces the boost value depending on how close the item is to its deadline and how many impressions have been served.)
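Such a table might be sketched as a lookup like the following (the thresholds and boost values are invented for illustration):

```python
# Hypothetical boost schedule: the boost shrinks as the new item
# accumulates impressions and approaches its promotion deadline.
def boost_score(hours_to_deadline, impressions_served):
    table = [
        # (min hours left, max impressions served) -> boost
        ((24, 100), 3.0),
        ((12, 300), 2.0),
        ((6, 1000), 1.0),
    ]
    for (hours, imps), boost in table:
        if hours_to_deadline >= hours and impressions_served <= imps:
            return boost
    return 0.0  # no boost once past the deadline / impression budget

print(boost_score(30, 50))   # fresh item, few impressions -> 3.0
print(boost_score(8, 500))   # near deadline, many served   -> 1.0
```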

Another way is to vary the number of impressions based on the predicted quality of the item: give good items more impressions.



Summary of Shusen Wang's tutorial: DPP to improve variety in Recommendation

 Determinantal Point Processes (DPP)

Let the item vectors span a volume. The larger the volume, the more variety; when the volume is small, the items are similar to each other.

Let V be a d x k matrix, collecting k vectors of d dimensions. The squared volume is calculated as the determinant det(V.transpose * V).

DPP math formula:

   argmax [ log det(V.transpose * V) ]
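A quick numerical check of the volume intuition (random vectors are stand-ins for item embeddings): near-duplicate vectors give a much smaller log-determinant than diverse ones.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 8, 3

# Diverse items: independent random vectors span a large volume
V_diverse = rng.normal(size=(d, k))

# Redundant items: near-duplicate columns span almost no volume
base = rng.normal(size=(d, 1))
V_similar = np.hstack([base,
                       base + 1e-3 * rng.normal(size=(d, 1)),
                       base + 1e-3 * rng.normal(size=(d, 1))])

def log_volume(V):
    # log det(V^T V) = 2 * log(volume spanned by the columns of V)
    sign, logdet = np.linalg.slogdet(V.T @ V)
    return logdet

print(log_volume(V_diverse) > log_volume(V_similar))  # True
```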

Hulu applied DPP in recommendation system:

   argmax [ a * total reward + (1 - a) * log det(V.transpose * V) ]

To solve this, greedy search is applied: find one item at a time that has a high reward while not being too close to the already-picked items. Hulu provided a more efficient way to compute the similarity part, using Cholesky decomposition.

Cholesky decomposition factors a matrix into a triangular matrix multiplied by its own transpose. In Hulu's proposal, the Cholesky factor at each step is computed from the Cholesky factor of the previous step, to avoid repeated calculation.
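A naive version of the greedy search can be sketched as follows. Note this recomputes the log-determinant from scratch at every step purely for clarity; Hulu's contribution is precisely to avoid that by updating the Cholesky factor incrementally. The vectors, rewards, and the trade-off weight `alpha` are made up.

```python
import numpy as np

def greedy_dpp(V, rewards, k, alpha=0.7):
    """Greedily pick k items trading off reward and diversity.
    V: (d, n) unit-norm item vectors; rewards: (n,) predicted rewards."""
    n = V.shape[1]
    selected = []
    for _ in range(k):
        best, best_score = None, -np.inf
        for i in range(n):
            if i in selected:
                continue
            S = V[:, selected + [i]]
            # Diversity term: log det of the Gram matrix of picked items
            sign, logdet = np.linalg.slogdet(S.T @ S)
            score = alpha * (sum(rewards[j] for j in selected) + rewards[i]) \
                    + (1 - alpha) * logdet
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
    return selected

rng = np.random.default_rng(1)
V = rng.normal(size=(5, 10))
V /= np.linalg.norm(V, axis=0)   # unit-norm columns
rewards = rng.uniform(size=10)
picked = greedy_dpp(V, rewards, k=3)
print(len(picked))  # 3 distinct items
```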


References:

https://www.youtube.com/watch?v=HjpJeUSekKs

https://www.youtube.com/watch?v=wi8xVHiZZr4

https://lsrs2017.files.wordpress.com/2017/08/lsrs_2017_lamingchen.pdf



Tuesday, March 12, 2024

How To Search AWS CloudWatch Log by Date?

 

Knowing a timestamp where an activity happened, go to your CloudWatch log group and search the log streams by prefix using the date:

Locate the streams whose last event time is later than your timestamp. There will be many candidate logs.



Open each candidate log and search for the keyword "INIT_START" to make sure the log stream started before your timestamp. If it matches, you can then perform other searches by keyword.


Repeat on all candidate log streams.

To locate a particular event in a log stream at a timestamp, search by the keyword "END": it marks the start and the end of each request. Knowing the time range of the request, you can then perform further searches.
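Assuming these are Lambda logs, whose stream names start with the UTC date (e.g. 2024/03/12/[$LATEST]abc...), a small helper can turn an event timestamp into the prefix to search by (the example timestamp below is made up):

```python
from datetime import datetime, timezone

def stream_prefix(epoch_ms):
    """Lambda log streams are named 'YYYY/MM/DD/[$LATEST]...' in UTC,
    so the UTC date of the event gives the prefix to search by."""
    dt = datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)
    return dt.strftime("%Y/%m/%d")

print(stream_prefix(1710252000000))  # 2024/03/12
```

Beware the UTC conversion: an event late in your local evening may land in the next day's prefix.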




Tuesday, March 05, 2024

Summary of Shusen Wang's tutorial: Look-alike Prediction Recommendation

Start with seed users, from whom we know the users' background: age group, education level, income, etc. Then find similar users who didn't provide much information, using methods including User Collaborative Filtering.

Focus on the users who interacted with a new item and treat them as seed users. Take an average of the seed users' vectors to get a single vector. Use this vector to spread the item to look-alike users (by looking it up in a vector database). Note that this feature vector needs to be updated as more users interact with the item.
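A sketch of the averaging-and-lookup step (the user embeddings and seed IDs are invented; a real system would query a vector database instead of scanning a dictionary):

```python
import numpy as np

def normalize(v):
    return v / np.linalg.norm(v)

# Hypothetical user embeddings; the seed users interacted with the new item
rng = np.random.default_rng(0)
users = {f"u{i}": normalize(rng.normal(size=8)) for i in range(100)}
seed_ids = ["u1", "u5", "u9"]

# Average the seed users' vectors to represent the item's audience
item_vec = normalize(np.mean([users[u] for u in seed_ids], axis=0))

# Look-alike spread: nearest neighbors of the averaged vector
scores = {u: float(users[u] @ item_vec) for u in users if u not in seed_ids}
look_alikes = sorted(scores, key=scores.get, reverse=True)[:10]
print(len(look_alikes))  # top 10 look-alike users
```

When new interactions arrive, `item_vec` is recomputed over the enlarged seed set, so the look-alike audience drifts with the item's actual consumers.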


Reference: https://www.youtube.com/watch?v=pjmRo8Uzzqg