Calls that default to conversational task fail with a 404

#82
by ccozad - opened

tl;dr: The HuggingFaceInferenceAPI class in LlamaIndex calls the conversational API, which now causes a 404. A workaround is to add a task="text-generation" parameter to force the library to use a valid task name. The same problem may appear in other areas that default to the conversational task.

I ran into an issue in the "Components of LlamaIndex" notebook. The section where you create a VectorStoreIndex and then use it (index.as_query_engine(...) followed by query_engine.query(...)) throws a 404 Not Found exception like:

huggingface_hub.errors.HfHubHTTPError: 404 Client Error: Not Found for url: https://router.huggingface.co/hf-inference/models/Qwen/Qwen2.5-Coder-32B-Instruct/v1/chat/completions

Based on web searches, the conversational task was deprecated in 2024; it looks like it may have finally been removed recently.

The relevant class in LlamaIndex documents the following:

class HuggingFaceInferenceAPI(FunctionCallingLLM):
    """
    Wrapper on the Hugging Face's Inference API.

    Overview of the design:
    - Synchronous uses InferenceClient, asynchronous uses AsyncInferenceClient
    - chat uses the conversational task: https://huggingface.co/tasks/conversational
    - complete uses the text generation task: https://huggingface.co/tasks/text-generation

    Note: some models that support the text generation task can leverage Hugging
    Face's optimized deployment toolkit called text-generation-inference (TGI).
    Use InferenceClient.get_model_status to check if TGI is being used.

    Relevant links:
    - General Docs: https://huggingface.co/docs/api-inference/index
    - API Docs: https://huggingface.co/docs/huggingface_hub/main/en/package_reference/inference_client
    - Source: https://github.com/huggingface/huggingface_hub/tree/main/src/huggingface_hub/inference
    """

To work around the issue, HuggingFaceInferenceAPI can be called with the task parameter set to "text-generation", which forces the library to use a valid task name.

The full call should look like the following:

llm = HuggingFaceInferenceAPI(
    token=hf_token, 
    model="Qwen/Qwen2.5-Coder-32B-Instruct",
    task="text-generation"
)

or like so if using a notebook with the HF token set earlier:

llm = HuggingFaceInferenceAPI(
    model="Qwen/Qwen2.5-Coder-32B-Instruct",
    task="text-generation"
)

Relevant issue on the LlamaIndex side: https://github.com/run-llama/llama_index/issues/18547
