Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: db_name should return when list_connections called #2161

Open
1 task done
hasansustcse13 opened this issue Jul 2, 2024 · 4 comments · May be fixed by #2285
Open
1 task done

[Bug]: db_name should return when list_connections called #2161

hasansustcse13 opened this issue Jul 2, 2024 · 4 comments · May be fixed by #2285

Comments

@hasansustcse13
Copy link

Is there an existing issue for this?

  • I have searched the existing issues

Describe the bug

I have a requirement where I need to connect to different db on same address (host + port) at the same time. And I have to use existing connection if it's alive for this is the code has been written (basically written in langchain codebase)

  if given_address is not None:
            for con in connections.list_connections():
                addr = connections.get_connection_addr(con[0])
                if (
                    con[1]
                    and ("address" in addr)
                    and (addr["address"] == given_address)
                    and ("user" in addr)
                    and (addr["user"] == tmp_user)
                ):
                    logger.debug("Using previous connection: %s", con[0])
                    return con[0]

        # Generate a new connection if one doesn't exist
        alias = uuid4().hex
        try:
            connections.connect(alias=alias, **connection_args)
            logger.debug("Created new connection using: %s", alias)
            return alias
        except MilvusException as e:
            logger.error("Failed to create new connection using: %s", alias)
            raise e

Let's say collection_1 is connected to db_1 and Now I want collection_2 will be connect to db_2. As addr = connections.get_connection_addr(con[0]) don't have the db_name value the if block will be executed and provide db_1 connection which is wrong.

I believe this is single line fix when connect to milvus we should store the db_name instead of popping.
Code from Milvus:

def connect_milvus(**kwargs):
            # rest of the code
            kwargs.pop("password")
            kwargs.pop("token", None)
            kwargs.pop("db_name", "")  # the bug

            self._connected_alias[alias] = gh
            self._alias[alias] = copy.deepcopy(kwargs)

Expected Behavior

No response

Steps/Code To Reproduce behavior

No response

Environment details

- Hardware/Softward conditions (OS, CPU, GPU, Memory):
- Method of installation (Docker, or from source):
- Milvus version (v0.3.1, or v0.4.0):
- Milvus configuration (Settings you made in `server_config.yaml`):

Anything else?

No response

@hasansustcse13 hasansustcse13 added the kind/bug Something isn't working label Jul 2, 2024
@XuanYang-cn XuanYang-cn added kind/feature and removed kind/bug Something isn't working labels Jul 2, 2024
@tvvignesh
Copy link

tvvignesh commented Aug 10, 2024

@XuanYang-cn I am facing the same issue where I am unable to switch different databases for collections in langchain. It all goes to default even when db_name is specified in connection_args in langchain_milvus and debugging further it looks like its a related issue to this.

Also, noticed this PR. Would it resolve this issue? #1446

To get it working, I need to call this on every database I am connected to before running the operation so that the registry is clean and then connect back again. This looks like a bug, not a feature.

This works:

connections.disconnect("default")
connections.remove_connection("default")

connections.disconnect("db1")
connections.remove_connection("db1")

vectorstore = Milvus(
            self.embeddings,
            collection_name=collection_name,
            auto_id=True,
            connection_args={
                "user": self.user,
                "password": self.password,
                "host": self.host,
                "port": self.port,
                "db_name": self.db_name,
            },
            index_params={}
        )

If I directly do it without that, it doesn't since the existing connections array is not empty and you are always picking con[0]

@tvvignesh
Copy link

tvvignesh commented Aug 10, 2024

Also, the connection object seems global in nature - can that be made local? For eg. lets say I want to have a dict like this:

{
"db1":con1,
"db2":con2
}

This currently doesn't seem possible since all connections seem to go in a global connections object if I am not wrong

Similar stackoverflow thread here: https://stackoverflow.com/a/78144442

@tvvignesh
Copy link

tvvignesh commented Aug 21, 2024

@XuanYang-cn Was able to get it working by modifying pymilvus package - requesting to make appropriate changes in the package as well. Thanks

1st change: I had to add this condition to if in pymilvus to reuse connection by comparing db_name as well (line 346 in screenshot) in langchain_milvus package in connections.py

image

Also, I got rid of popping db_name from connections.py in pymilvus package

image

@hasansustcse13
Copy link
Author

@XuanYang-cn I believe this was just one line fix. Do you have a plan to fix it? Langchain team has made the changes which now make it worst as if I provide db name anything else rather than default it will create new connection every time.
Langchain Merged: https://github.com/langchain-ai/langchain/pull/25627/files

CC: @tvvignesh

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment