Document Input Compatibility?

abgup · March 28, 2024, 12:32am

Given a document of text + images, will Marqo vectorize the text within the document as well as the images? The images could be pictures placed directly in the document, or pictures in a standard image format that have text embedded in them.

owen-elliott · April 1, 2024, 10:21pm

The fields that get vectorised depend on the ones you specify as the tensorFields. Text and images must be contained in separate fields, and images are provided via URLs. For example, this document:

{
    "_id": "doc1",
    "text": "This is a field with some text in it",
    "image": "https://image.com/image.jpg"
}

could have the text vectorised with tensorFields=["text"], or the image with tensorFields=["image"]. Alternatively, you could combine the two into a single vector with:

import marqo

mq = marqo.Client(url="http://localhost:8882")

# Use a CLIP model and have Marqo download image URLs and embed them as images
settings = {
    "treat_urls_and_pointers_as_images": True,
    "model": "open_clip/ViT-B-32/laion2b_s34b_b79k",
}

mq.create_index("my-first-multimodal-index", **settings)

document = {
    "_id": "doc1",
    "text": "This is a field with some text in it",
    "image": "https://image.com/image.jpg"
}

# "text_image_field" is created by the mapping below: it combines the
# "text" and "image" fields into one vector using the given weights
mq.index("my-first-multimodal-index").add_documents(
    [document],
    tensor_fields=["text_image_field"],
    mappings={
        "text_image_field": {
            "type": "multimodal_combination",
            "weights": {"text": 0.1, "image": 0.9},
        }
    },
)

The example above uses the Python client; however, you could of course use the API from any language of your choice.
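If you do go through the API directly, the request might look roughly like the sketch below. It only builds the requests and does not send them. I'm assuming the Marqo 2.x route POST /indexes/{index_name}/documents and a JSON body with the camelCase keys "documents", "tensorFields" and "mappings"; check the API reference for your version before relying on it. The helper name build_add_documents_request is just something made up for illustration.

```python
import json
import urllib.request

MARQO_URL = "http://localhost:8882"  # same local endpoint as the client example


def build_add_documents_request(index_name, documents, tensor_fields, mappings=None):
    """Build (but do not send) a request that adds documents to an index.

    Hypothetical helper. The route and body keys are assumptions based on
    the Marqo 2.x REST API and may differ in other versions.
    """
    body = {"documents": documents, "tensorFields": tensor_fields}
    if mappings is not None:
        body["mappings"] = mappings
    return urllib.request.Request(
        url=f"{MARQO_URL}/indexes/{index_name}/documents",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


document = {
    "_id": "doc1",
    "text": "This is a field with some text in it",
    "image": "https://image.com/image.jpg",
}

# Vectorise only the text field
text_only = build_add_documents_request(
    "my-first-multimodal-index", [document], ["text"]
)

# Combine text and image into a single vector, as in the client example
combined = build_add_documents_request(
    "my-first-multimodal-index",
    [document],
    ["text_image_field"],
    mappings={
        "text_image_field": {
            "type": "multimodal_combination",
            "weights": {"text": 0.1, "image": 0.9},
        }
    },
)
```

With a Marqo instance running, either request could then be sent with urllib.request.urlopen(combined), or the same URL and JSON body could be used from curl or any other HTTP client.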
