Unstructured Data in HANA

neelesh_jain3
Contributor
0 Kudos

Is HANA Capable of Handling Unstructured Data?

Accepted Solutions (0)

Answers (1)

richard_bremer
Advisor
0 Kudos

Hi Neelesh,

Current HANA releases expose no capability for handling unstructured data. We are currently integrating text search and text analysis, including backend functionality and a UI component, which we expect to ship with the next support package.

Find a short demo video in this community at https://www.experiencesaphana.com/videos/1046

Kind regards,

Richard

--

Dr. Richard Bremer

Customer Solution Adoption (CSA), SAP AG

Former Member
0 Kudos

Hi Richard,

Is this the UI tool kit functionality now available with SPS 4?

http://help.sap.com/hana/ui_toolkit/index.html

Is it possible to upload PDF files to HANA and search through the content of the files using this option?

Thanks.

Deepu

Former Member
0 Kudos

Hi, I was able to upload PDF files to a BLOB column in a column table in the HANA database with help from Juergen Schmerder. All you need is a simple script, written in any programming language that can establish a connection through ODBC or JDBC (.NET, Java, etc.). Here's a sample of the script that I used to upload the files. As you can see, it is quite simple:


# SAP HANA Python client. Assumption: the hdbcli package is installed;
# older client installs expose the same module as a top-level "dbapi".
from hdbcli import dbapi

con = dbapi.connect('hanahost', 30015, 'SYSTEM', '********')  # Open connection to SAP HANA
cur = con.cursor()  # Open a cursor

with open('doc.pdf', 'rb') as file:  # Open the file read-only, in binary mode
    content = file.read()  # Save the content of the file in a variable

cur.execute("INSERT INTO BLOBTEST VALUES(?,?)", (2, content))  # Save the content to a table

cur.close()  # Close the cursor
con.close()  # Close the connection
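
The same steps can also be wrapped in a small reusable function. The following is only a sketch under stated assumptions, not the script I actually used: `upload_file` is a hypothetical helper, the column names `ID` and `FILE_CONTENT` are made up, and Python's built-in `sqlite3` module stands in for the HANA connection so the example runs without a server. Both drivers follow the Python DB-API and accept `?` placeholders, so a connection obtained from `dbapi.connect(...)` should be usable in its place, but that is untested here.

```python
import os
import sqlite3
import tempfile


def upload_file(con, doc_id, path):
    """Insert the raw bytes of the file at `path` into BLOBTEST under `doc_id`.

    `con` is any DB-API connection that accepts '?' placeholders.
    Returns the number of bytes stored.
    """
    with open(path, "rb") as f:  # binary mode keeps the file bytes unchanged
        content = f.read()
    cur = con.cursor()
    try:
        cur.execute("INSERT INTO BLOBTEST VALUES(?,?)", (doc_id, content))
    finally:
        cur.close()
    con.commit()
    return len(content)


# Stand-in database (assumption: sqlite3 instead of HANA, hypothetical column names).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE BLOBTEST (ID INTEGER, FILE_CONTENT BLOB)")

# Create a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
    tmp.write(b"%PDF-1.4 sample bytes")
    sample_path = tmp.name

stored = upload_file(con, 2, sample_path)
row = con.execute("SELECT ID, FILE_CONTENT FROM BLOBTEST").fetchone()
os.remove(sample_path)
con.close()

print("bytes stored:", stored, "for document id", row[0])
```

Using `with` and `try`/`finally` means the file and the cursor are closed even if the insert fails.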

Now, to be able to search within the content of the files, you will need to use Fuzzy Search. Here's an example of a query that looks for the word "march" in the content of the files. The score you get back is a TF/IDF score (Term Frequency/Inverse Document Frequency): it is calculated from the number of times the word "march" is found in the content of each file, so the file with the most matches gets the highest score.

SELECT TO_DECIMAL(SCORE(),3,2) AS score, *
FROM BLOBTEST
WHERE CONTAINS("File_Content", 'march',
      FUZZY(0.5, 'textSearch=fulltext'))
ORDER BY "Year", "Month";
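
To make the scoring idea more concrete, here is a toy TF/IDF calculation in plain Python. It is a sketch of the general technique only, not HANA's internal formula; the sample documents and the `tf_idf` function are made up for illustration. Note that this variant divides the term count by the document length, so it rewards how dense the matches are and not just how many there are.

```python
import math


def tf_idf(term, doc, corpus):
    """Toy TF/IDF score of `term` for one document within `corpus`."""
    words = doc.lower().split()
    # Term frequency: share of the document's words that equal the term.
    tf = words.count(term) / len(words)
    # Inverse document frequency: terms found in fewer documents weigh more.
    containing = sum(1 for d in corpus if term in d.lower().split())
    idf = math.log(len(corpus) / containing) if containing else 0.0
    return tf * idf


# Hypothetical file contents, already extracted to plain text.
corpus = [
    "sales rose in march and again in march",
    "the march report is late",
    "nothing relevant here",
]

scores = [tf_idf("march", doc, corpus) for doc in corpus]

# Print the documents ranked by score, best match first.
for score, doc in sorted(zip(scores, corpus), reverse=True):
    print(round(score, 3), doc)
```

The first document mentions "march" twice and ranks highest, while the third does not mention it at all and scores zero.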

Thanks, Lucas.

Practice your SAP HANA™ development skills:

www.GetYourHandsOn.it

Information in Spanish about SAP HANA™:

www.HablemosHANA.com

Former Member
0 Kudos

Hi Lucas,

I am now trying to build a text analysis app.

One feature is that when the user selects a PDF file and clicks the upload button on the web page, the program uploads the content of the PDF into a column of a HANA table.

I have read your blog before, and I am now trying to follow the steps described in the blog and its comments.

http://scn.sap.com/community/developer-center/hana/blog/2013/01/03/sap-hana-text-analysis

I wonder how a new developer without any scripting knowledge could implement this method.

Is there any mechanism built into HANA Studio now to do this?

Thanks!