I recently joined a startup and have been tasked with setting up package management for our internal Python libraries. We work in the biotech and ML space, and a lot of the packages we use are indexed on conda channels.
Our current setup installs our internal repositories, which are built with setup.py, directly from GitHub. Initially I thought we should just use Poetry for all of our package management and, for our own private libraries, have an AWS CodeBuild/CodeArtifact setup to host them. I still think this seems like the best option for package management in Python.
On the other hand, too many of the packages we need are available in the conda ecosystem but not on PyPI. We've also noticed several dependency clashes between pip and conda when using both at the same time. So we might as well lean into the conda ecosystem completely.
To do this, I think a good approach would be to use a private conda channel for our own libraries, and conda-forge for anything we would otherwise get from pip. If we can't find a package on conda-forge, there seems to be a pretty easy process for getting it there from PyPI.
My question is the following: Has anyone hosted their own conda channel before?
I haven't really found any reports or guides on the standard way of doing this, which I find really surprising, because I can't be the only one running into this. As far as I can tell, Artifactory is the most enterprise-ready solution available, but I'm curious whether there's something I haven't seen, and whether others have run into this problem as well.
I've seen and tried options from:
The only way to use the AWS and Azure solutions is to mount the files from S3 locally in order to use the channel correctly. This just does not seem like the right way to use a conda channel, not to mention that it involves downloading all or most of the files in the bucket in order to properly index the channel.
The Anaconda and Quetz solutions seem like a step up from mounting the S3 buckets, but they don't allow for federated logins, at least not natively, which leaves using something like Artifactory or an equivalent tool.
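For context on why indexing a bucket-backed channel means pulling the files down: as far as I understand it, a channel is just a directory of platform subfolders (noarch, linux-64, ...), each holding a repodata.json that is built from metadata stored inside every package archive. Below is a rough sketch of what an indexer has to do for one subfolder. It is only an illustration, not a replacement for the real `conda index` tool; the function names and the fake `internal_lib` package are made up for the demo, it only handles .tar.bz2 archives, and it records fewer fields than a real index does.

```python
import hashlib
import io
import json
import tarfile
import tempfile
from pathlib import Path


def build_repodata(subdir_path: Path) -> dict:
    """Collect metadata for every .tar.bz2 package in one channel subfolder."""
    packages = {}
    for pkg in sorted(subdir_path.glob("*.tar.bz2")):
        # The whole archive is read: once for the checksum, once for its metadata.
        data = pkg.read_bytes()
        with tarfile.open(pkg, "r:bz2") as tar:
            record = json.load(tar.extractfile("info/index.json"))
        record["size"] = len(data)
        record["sha256"] = hashlib.sha256(data).hexdigest()
        packages[pkg.name] = record
    return {
        "info": {"subdir": subdir_path.name},
        "packages": packages,
        "repodata_version": 1,
    }


def make_fake_package(subdir_path: Path, name: str, version: str) -> Path:
    """Write a tiny stand-in package archive (hypothetical, demo only)."""
    subdir_path.mkdir(parents=True, exist_ok=True)
    index = {
        "name": name,
        "version": version,
        "build": "py_0",
        "build_number": 0,
        "depends": [],
        "subdir": subdir_path.name,
    }
    payload = json.dumps(index).encode()
    info = tarfile.TarInfo("info/index.json")
    info.size = len(payload)
    pkg_path = subdir_path / f"{name}-{version}-py_0.tar.bz2"
    with tarfile.open(pkg_path, "w:bz2") as tar:
        tar.addfile(info, io.BytesIO(payload))
    return pkg_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        noarch = Path(tmp) / "channel" / "noarch"
        make_fake_package(noarch, "internal_lib", "0.1.0")
        repodata = build_repodata(noarch)
        (noarch / "repodata.json").write_text(json.dumps(repodata, indent=2))
        print(sorted(repodata["packages"]))
```

Because every archive has to be opened and hashed, anything that indexes a channel needs read access to the package files themselves, which is what makes the mount-and-index approach on a bucket feel so heavy.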
Here is what the environment.yml file would look like for what we currently do.
name: package
channels:
  - conda-forge
  - bioconda
  - defaults
dependencies:
  - awscli=1.27.134
  - pip
  - python=3.10.11
  - pip:
    - GitHub.com/internal_repo/version/files.tar.gz
    - other pip dependencies
This is a repost of the following posts: https://www.reddit.com/r/Python/comments/15bw2gn/python_packaging_with_conda/ https://www.reddit.com/r/learnpython/comments/15bw310/python_packaging_with_conda/
Other resources I've checked: